Geolocalized Modeling for Dish Recognition

Abstract
Food-related photos have become increasingly popular, due to social networks, food recommendation and
dietary assessment systems. Reliable annotation is essential in those systems, but unconstrained automatic food
recognition is still not accurate enough. Most works focus on exploiting only the visual content while ignoring the
context. To address this limitation, in this paper we explore leveraging geolocation and external information about
restaurants to simplify the classification problem. We propose a framework incorporating discriminative classification
in geolocalized settings and introduce the concept of geolocalized models, which in our scenario are trained locally
at each restaurant location. In particular, we propose two strategies to implement this framework: geolocalized voting
and combinations of bundled classifiers. Both models show promising performance, and the latter is particularly
efficient and scalable. We collected a restaurant-oriented food dataset with food images, dish tags and restaurant-level
information, such as the menu and geolocation. Experiments on this dataset show that exploiting geolocation
improves recognition performance by around 30%, and geolocalized models contribute an additional 3 to 8%
absolute gain while being trainable up to five times faster.
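
The idea of using geolocation and menu information to simplify the classification problem can be sketched roughly as follows. This is only an illustration under stated assumptions, not the framework proposed in the paper: the restaurant records, the search radius, the classifier scores and all function names are hypothetical. The sketch keeps only the dishes served by restaurants near the photo's location and picks the best-scoring class among them.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def candidate_dishes(restaurants, lat, lon, radius_km):
    """Union of the menus of all restaurants within radius_km of the query point."""
    candidates = set()
    for r in restaurants:
        if haversine_km(lat, lon, r["lat"], r["lon"]) <= radius_km:
            candidates.update(r["menu"])
    return candidates

def classify(scores, candidates):
    """Pick the best-scoring dish among the candidates.

    Falls back to all classes when no candidate has a score
    (an assumption of this sketch, e.g. no restaurant nearby).
    """
    restricted = {dish: s for dish, s in scores.items() if dish in candidates}
    if not restricted:
        restricted = scores
    return max(restricted, key=restricted.get)

# Hypothetical data: two restaurants and the scores of a visual classifier.
restaurants = [
    {"lat": 0.0, "lon": 0.0, "menu": {"ramen", "gyoza"}},
    {"lat": 10.0, "lon": 10.0, "menu": {"pizza"}},
]
scores = {"pizza": 0.9, "ramen": 0.6, "gyoza": 0.2}

nearby = candidate_dishes(restaurants, 0.001, 0.001, radius_km=1.0)
global_guess = max(scores, key=scores.get)  # ignores the context
local_guess = classify(scores, nearby)      # restricted to nearby menus
```

In this toy case the unconstrained prediction is a dish that no nearby restaurant serves, while the restricted prediction is limited to the nearby menu, which is the kind of simplification the abstract refers to.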



Food recognition
Previous works on dish recognition are mainly based on analyzing the visual appearance. Some works address
food recognition using conventional visual features trying to capture the global appearance of the food. Joutou
and Yanai [6] proposed an automatic food image recognition system based on multiple kernel learning (MKL),
which integrates several kinds of image features (e.g. color, texture, SIFT) to learn an optimal linear combination
of feature-specific kernels. Hoashi et al. [7] extended the system proposed in [6] with more image features and
food classes. Maruyama et al. [8] improved the recognition accuracy by incrementally updating the classifier based
on a Bayesian network. Zong et al. [2] proposed to exploit the structure of the food object which is represented as
the spatial distribution of the local textural structures and encoded using shape context. Kawano et al. [9] computed
Fisher vectors over HOG patches to develop a real-time mobile food recognition system. Recently, they extended
the system to 256 food categories [10].
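
The linear combination of feature-specific kernels mentioned above for the MKL-based system [6] can be illustrated with a small sketch. It is a simplified illustration, not the implementation in [6]: the RBF kernel, the feature names, the bandwidths and the hand-fixed weights are all assumptions, whereas MKL learns the weights jointly with the classifier.

```python
import math

def rbf_kernel(x, y, gamma):
    """RBF kernel between two feature vectors (kernel choice is an assumption)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def combined_kernel(feats_x, feats_y, weights, gammas):
    """Weighted sum of one kernel per feature type: K = sum_f beta_f * K_f.

    `weights` holds the non-negative coefficients beta_f. Here they are
    fixed by hand; in MKL they would be learned.
    """
    return sum(
        weights[name] * rbf_kernel(feats_x[name], feats_y[name], gammas[name])
        for name in weights
    )

# Hypothetical per-image features for two images.
img_a = {"color": [0.2, 0.5, 0.3], "texture": [1.0, 0.0]}
img_b = {"color": [0.1, 0.6, 0.3], "texture": [0.0, 1.0]}
weights = {"color": 0.7, "texture": 0.3}  # assumed, sums to 1
gammas = {"color": 1.0, "texture": 0.5}   # assumed bandwidths

k_ab = combined_kernel(img_a, img_b, weights, gammas)
k_aa = combined_kernel(img_a, img_a, weights, gammas)
```

A kernel matrix built this way could be passed to any kernel classifier that accepts precomputed kernels; the combined value of an image with itself equals the sum of the weights, since each RBF kernel is 1 on identical inputs.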
Other works consider food as a combination of different components (ingredients). Yang et al. [1] proposed an
American fast food recognition system using pairwise local features, which effectively capture important shape
characteristics and spatial relationships between food ingredients. Dietcam [3] analyzes a meal by taking several
images (or a short video), estimating the volume of each food item and finally estimating the caloric
intake. The recognition accuracy is increased by modeling the geometric locations of foods and a joint probability
model. Zhang [17] proposed to classify plates of food into the correct cuisine using attribute-based classification,
