Yang Yue1,2, Jia Chen1,2, Bo Hu1,2, Rong Xie3, Xiao-Qing Zuo4, Xing Xie5
1 State Key Lab of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University
2 Engineering Research Center for Smart Acquisition and Applications of Spatiotemporal Data, Ministry of Education, Wuhan University
3 International School of Software, Wuhan University
4 Faculty of Land Resource Engineering, Kunming University of Science and Technology
5 Microsoft Research Asia, Beijing
yueyang@whu.edu.cn
ABSTRACT
We present our ongoing work on labeling personal characteristics using mobile phone trace data, POIs (Points of Interest), and real estate price data. POIs and real estate data are used to extract semantic features of the regions where mobile phone users are actively involved. Based on these regional features, a group of personal labels can be attached to each user.
Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: Retrieval models. J.4 [Social and Behavioral Sciences]: Sociology.
General Terms
Algorithms, Experimentation.
Keywords
Mobile Phone Data, Semantic Label, Trajectory Data Analysis.
2. PROPOSED METHODOLOGY
The basic assumption of this study is that the regions where people live, work, and hang out are those that meet their preferences. Therefore, the characteristics of those regions, to a great extent, also reflect the characteristics of the person. We first identify the regions closely attached to a user, i.e., ROIs (Regions of Interest), by spatially aggregating the mobile phone traces. Then, we extract the features of these ROIs using data crawled from POI review websites and real estate websites, together with map data. Last, the regional features are associated with the user as his/her personal labels. Figure 1 shows the label matching method. Here, we assume that people work during the daytime and stay home at night.
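The day/night assumption above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `label_rois`, the daytime window (08:00 to 19:00), the 50% threshold, and the sample timestamps are all hypothetical, and the sketch assumes each trace point has already been assigned to an ROI by the spatial aggregation step.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical daytime window; the source only distinguishes "daytime" and "night".
DAY_START_HOUR = 8
DAY_END_HOUR = 19


def label_rois(trace_points, day_share_threshold=0.5):
    """Label each ROI "Workplace" or "Home" from its share of daytime points.

    trace_points: iterable of (roi_id, datetime) pairs. Each point is assumed
    to be already assigned to an ROI (the spatial aggregation is not shown).
    """
    day_counts = defaultdict(int)
    total_counts = defaultdict(int)
    for roi_id, timestamp in trace_points:
        total_counts[roi_id] += 1
        if DAY_START_HOUR <= timestamp.hour < DAY_END_HOUR:
            day_counts[roi_id] += 1

    labels = {}
    for roi_id, total in total_counts.items():
        day_share = day_counts[roi_id] / total
        # Mostly daytime activity -> workplace; otherwise treat as home.
        labels[roi_id] = "Workplace" if day_share > day_share_threshold else "Home"
    return labels


# Made-up sample points for two ROIs, for illustration only.
points = [
    ("R1", datetime(2013, 5, 6, 10, 0)),
    ("R1", datetime(2013, 5, 6, 15, 30)),
    ("R1", datetime(2013, 5, 6, 21, 0)),
    ("R3", datetime(2013, 5, 6, 23, 0)),
    ("R3", datetime(2013, 5, 7, 2, 0)),
    ("R3", datetime(2013, 5, 7, 12, 0)),
]
roi_labels = label_rois(points)
print(roi_labels)
```

A single threshold on the daytime share is the simplest reading of the assumption; a fuller treatment would probably also separate working days from weekends.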
Figure 2. A Person’s Mobile Phone Trace and ROIs

Although this person’s trace spreads over a wide area, only three important regions are attached to him/her. It can be observed that R1 and R3 are the regions where the person spent most of the time: R1 is associated with working time, and R3 is more related to off-work time. Thus, it is very likely that the person works at R1 and lives at R3. In R3, around 60% of housing prices are between 8,000 and 10,000 RMB/m², which is above the average price of the study area. A label possibly associated with the person is therefore “Middle-high income”.

As for R2, most of the trace points were generated in the daytime, on both working days and weekends (Figure 3a). We further examined the top 10 POI categories in this region, as shown in Figure 3b. It can be inferred that this area is highly related to building and decoration materials. Since the time pattern of R2 is very similar to that of R1, our algorithm labels it as “Workplace”. It is uncommon for a person to be associated with more than one workplace, but some people, such as a boss with two shops, do have such features. Although there may be other possibilities, this is the preliminary result generated by our algorithm using the existing dataset. Further work may be necessary to refine or validate the result. In general, the labels generated for this user are: middle-high income, building material.
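The income label discussed for the residential region R3 can be sketched as a comparison between the housing prices in the ROI and the average price of the study area. This is only an illustrative sketch under stated assumptions: the function `income_label`, the majority threshold, the area average of 7,500 RMB/m², and the price list are all hypothetical. The source states only that about 60% of prices in R3 fall between 8,000 and 10,000 RMB/m², above the area average.

```python
def income_label(roi_prices, area_average_price, majority_share=0.5):
    """Return "Middle-high income" if most prices in the ROI exceed the
    study-area average, otherwise None (no income label is assigned).

    roi_prices: housing prices (RMB/m2) observed in the residential ROI.
    area_average_price: average price of the whole study area (assumed known).
    """
    if not roi_prices:
        return None
    above = sum(1 for price in roi_prices if price > area_average_price)
    share_above = above / len(roi_prices)
    return "Middle-high income" if share_above > majority_share else None


# Made-up prices: 6 of 10 lie between 8,000 and 10,000 RMB/m2.
r3_prices = [8200, 8900, 9500, 9800, 8600, 9100, 6500, 7000, 7200, 11000]
label = income_label(r3_prices, area_average_price=7500)
print(label)
```

Returning `None` when the condition fails avoids inventing label names that the source does not define.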