[4/29] Object class recognition by unsupervised scale-invariant learning


[4/22] Photo Tourism: Exploring Photo Collections in 3D


[4/22] Creating and Exploring a Large Photorealistic Virtual Space


[4/15] World Explorer: Visualizing Aggregate Data from Unstructured Text in Geo-Referenced Collections

A. Abstract

The availability of map interfaces and location-aware devices makes a growing amount of unstructured, geo-referenced information available on the Web. This type of information can be valuable not only for browsing, finding and making sense of individual items, but also in aggregate form to help understand data trends and features. In particular, over twenty million geo-referenced photos are now available on Flickr, a photo-sharing website – the first major collection of its kind. These photos are often associated with user-entered unstructured text labels (i.e., tags). We show how we analyze the tags associated with the geo-referenced Flickr images to generate aggregate knowledge in the form of “representative tags” for arbitrary areas in the world. We use these tags to create a visualization tool, World Explorer, that can help expose the content of the data, using a map interface to display the derived tags and the original photo items. We perform a qualitative evaluation of World Explorer that outlines the visualization’s benefits in browsing this type of content. We provide insights regarding the aggregate versus individual-item requirements in browsing digital geo-referenced material.



B. Notes
Multi-level view on the map
cluster labels are shown as a tag cloud; font size depends on cluster size


K-means clustering on geographical distance; 3 <= k <= 15, chosen according to the number of photos
TF-IDF for textual scoring of tags
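
The textual scoring step can be sketched roughly as follows. This is a minimal illustration, assuming the geographic clusters are already computed and that each cluster is treated as one "document" of tags. The function name `tfidf_scores` and the sample tags are hypothetical, and the paper's actual scoring may differ in detail.

```python
import math
from collections import Counter

def tfidf_scores(clusters):
    """Score each tag in each cluster with a plain TF-IDF.

    clusters: list of tag lists, one list per geographic cluster.
    Returns one {tag: score} dict per cluster.
    """
    n = len(clusters)
    # Document frequency: number of clusters in which a tag appears.
    df = Counter(tag for tags in clusters for tag in set(tags))
    scores = []
    for tags in clusters:
        tf = Counter(tags)  # term frequency inside this cluster
        scores.append({t: c * math.log(n / df[t]) for t, c in tf.items()})
    return scores

# Hypothetical tags for three clusters in the same city.
clusters = [
    ["bridge", "goldengate", "sanfrancisco", "bridge"],
    ["alcatraz", "sanfrancisco", "island"],
    ["pier39", "sanfrancisco", "sealions"],
]
scores = tfidf_scores(clusters)
top_tags = [max(s, key=s.get) for s in scores]
```

A tag that appears in every cluster (here "sanfrancisco") gets a score of zero, so it is never picked as a representative tag for any single cluster.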


Related work: Flickr & WWMX
these use only the geo-tags, not the textual tags entered by users


future work: specifying a time period (for events) and a position (for places) at multiple scales
labels from previous work have time/locality patterns that can be used for classification

[4/15] Tour the World: building a web-scale landmark recognition engine

A. Abstract
Modeling and recognizing landmarks at world-scale is a useful yet challenging task. There exists no readily available list of worldwide landmarks. Obtaining reliable visual models for each landmark can also pose problems, and efficiency is another challenge for such a large scale system.
This paper leverages the vast amount of multimedia data on the web, the availability of an Internet image search engine, and advances in object recognition and clustering techniques, to address these issues.
First, a comprehensive list of landmarks is mined from two sources: (1) ~20 million GPS-tagged photos and (2) online tour guide web pages. Candidate images for each landmark are then obtained from photo sharing websites or by querying an image search engine.
Second, landmark visual models are built by pruning candidate images using efficient image matching and unsupervised clustering techniques.
Finally, the landmarks and their visual models are validated by checking authorship of their member images. The resulting landmark recognition engine incorporates 5312 landmarks from 1259 cities in 144 countries.
The experiments demonstrate that the engine can deliver satisfactory recognition performance with high efficiency.


B. Notes

Hierarchical Clustering

application
Google Landmark: upload a photo to look up the landmark's name, position, etc.

[4/1] On Spectral Clustering: Analysis and an algorithm


[4/1] Normalized Cuts and Image Segmentation

Image segmentation
- spatial space
  - region growing
  - watershed
- feature space
  - kmeans
- graph based

Min Cuts -> Normalized Cuts (Ncut)
minimizing Ncut exactly is NP-complete -> relax to a (generalized) eigenvalue problem to approximate

recursive two-way Ncut, keep splitting a segment
- while the eigenvalue is still small
- while the inter-group pixels are not very similar (it can still be segmented)
simultaneous K-way cut
- iterative merge
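
A single two-way split can be sketched as below. This is a minimal sketch, assuming NumPy and a small dense affinity matrix; the function name `two_way_ncut`, the toy graph, and the choice of thresholding at zero are illustrative assumptions (other splitting points, such as the median, are possible).

```python
import numpy as np

def two_way_ncut(W):
    """Split a graph in two using the relaxed normalized-cut criterion.

    W: symmetric (n, n) affinity matrix with positive node degrees.
    Returns a boolean array marking one side of the partition.
    """
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric form of the generalized problem (D - W) y = lambda * D y.
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Map the second-smallest eigenvector back to the generalized problem.
    y = d_inv_sqrt * eigvecs[:, 1]
    return y > 0  # threshold at zero (one simple choice of splitting point)

# Toy graph: two tight groups {0,1,2} and {3,4,5} joined by weak edges.
W = np.full((6, 6), 0.01)
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)
labels = two_way_ncut(W)
```

The recursive variant would call the same routine again on each side for as long as the split criteria above still hold.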

[3/25] Nonlinear dimensionality reduction by locally linear embedding

LLE

constraints to avoid degenerate (non-unique) solutions
1. mean of Y = 0
2. Y Y^T = I (unit covariance, up to a 1/N factor)
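
For reference, the embedding step that these two constraints belong to can be written out as below. The notation is an assumption on my part: Y_i is the low-dimensional coordinate of point i, W_ij are the reconstruction weights from the first LLE stage, and N is the number of points.

```latex
\min_{Y}\; \Phi(Y) = \sum_{i=1}^{N} \Bigl\| Y_i - \sum_{j} W_{ij} Y_j \Bigr\|^2
\quad \text{subject to} \quad
\sum_{i=1}^{N} Y_i = 0,
\qquad
\frac{1}{N} \sum_{i=1}^{N} Y_i Y_i^{\top} = I
```

The first constraint removes the freedom to translate the embedding; the second rules out the trivial solution Y = 0 and fixes the scale and orientation. With both in place, the minimizer is given by the bottom eigenvectors of (I - W)^T (I - W), discarding the constant one.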

Isomap

1. still uses k-NN (to build the neighborhood graph)
2. compute pairwise geodesic distances (shortest paths on the graph)
3. MDS on the geodesic distances
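
The three steps can be sketched in a few lines. This is a minimal sketch, assuming NumPy, a small dataset (everything is dense, O(n^3)), and a connected neighborhood graph; the function name `isomap` and the use of Floyd-Warshall for the shortest paths are my assumptions, not something the notes specify.

```python
import numpy as np

def isomap(X, n_neighbors, n_components):
    """Tiny dense Isomap: k-NN graph, shortest paths, classical MDS."""
    n = X.shape[0]
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    # Step 1: symmetric k-NN graph, inf where there is no edge.
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    for i in range(n):
        nbrs = np.argsort(dist[i])[1:n_neighbors + 1]  # skip the point itself
        G[i, nbrs] = dist[i, nbrs]
        G[nbrs, i] = dist[i, nbrs]
    # Step 2: geodesic distances via Floyd-Warshall shortest paths.
    for k in range(n):
        G = np.minimum(G, G[:, [k]] + G[[k], :])
    # Step 3: classical MDS on the squared geodesic distances.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:n_components]  # largest eigenvalues first
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

# Toy data: points on a half circle; the 1-D embedding should follow the arc.
t = np.linspace(0.0, np.pi, 30)
X = np.column_stack([np.cos(t), np.sin(t)])
Y = isomap(X, n_neighbors=2, n_components=1)
corr = abs(np.corrcoef(Y[:, 0], t)[0, 1])
```

On this toy curve the recovered coordinate is almost perfectly correlated with the arc parameter `t`, which is the "unrolling" behaviour that the geodesic distances are meant to provide.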

[3/25] Fisherface


Fisherface (FLD, Fisher's linear discriminant) differs from PCA (principal component analysis) in that it is supervised, so data points are projected onto a low-dimensional space that separates the different classes of data.
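
The difference from PCA shows up even in a two-class toy example. This is a minimal sketch, assuming NumPy and the two-class form of FLD; the function names and the synthetic data are hypothetical and only meant to illustrate the idea.

```python
import numpy as np

def fisher_direction(X, y):
    """Two-class FLD: w is proportional to S_w^{-1} (m1 - m0)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter matrix, summed over both classes.
    S_w = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(S_w, m1 - m0)
    return w / np.linalg.norm(w)

def pca_direction(X):
    """First principal component (ignores the labels)."""
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(Xc.T @ Xc)
    return vecs[:, -1]  # eigenvector of the largest eigenvalue

# Toy data: both classes spread widely along x, but differ only in y.
x = np.linspace(-5.0, 5.0, 20)
jitter = 0.1 * (-1.0) ** np.arange(20)
X = np.vstack([np.column_stack([x, jitter]),
               np.column_stack([x, 1.0 + jitter])])
y = np.array([0] * 20 + [1] * 20)

w_fld = fisher_direction(X, y)
w_pca = pca_direction(X)
proj = X @ w_fld  # 1-D projection along the Fisher direction
```

Here PCA picks the x axis because it carries the most variance, while FLD picks (almost exactly) the y axis, which is the direction along which the two classes separate cleanly.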