DOI: 10.1145/3132847.3132865
research-article

Latency Reduction via Decision Tree Based Query Construction

Published: 06 November 2017

Abstract

LinkedIn, as a professional network, serves the career needs of more than 450 million members. The task of the job recommendation system is to find suitable jobs among a corpus of several million jobs and serve them in real time under tight latency constraints. Job search involves finding suitable job listings given a user, query, and context. A typical scoring function for both search and recommendations matches various fields in the job description with various fields in the member profile. This in turn translates to evaluating a function with several thousand features to get the right ranking. In recommendations, evaluating all the jobs in the corpus for all members is not possible given the latency constraints. On the other hand, reducing the candidate set could lose relevant jobs. We present a way to model the underlying complex ranking function via decision trees. The branches within the decision trees are query clauses, and hence the decision trees can be mapped onto real-time queries. We developed an offline framework that evaluates the quality of the decision tree with respect to latency and recall. We tested the approach on job search and recommendations on LinkedIn, and A/B tests show significant improvements in member engagement and latency. Our techniques helped reduce job search latency by over 67% and our recommendations latency by over 55%. Our techniques show a 3.5% improvement in applications from job recommendations, primarily due to reduced timeouts from upstream services. As of writing, the approach powers all of job search and recommendations on LinkedIn.
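
To illustrate the idea of mapping a decision tree onto a query, here is a minimal sketch in Python. It is not the authors' implementation: the tree, the clause strings (such as "title:engineer"), the toy corpus, and all function names are hypothetical, and the fraction of the corpus retrieved stands in for latency as a rough assumption. Each root-to-leaf path that ends in a "relevant" leaf becomes a conjunction of clauses, the query is the disjunction of those paths, and an offline check reports recall against the size of the candidate set.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    relevant: bool          # True if jobs reaching this leaf should be retrieved


@dataclass
class Node:
    clause: str             # hypothetical query clause, e.g. "title:engineer"
    if_true: "Tree"         # subtree for jobs that match the clause
    if_false: "Tree"        # subtree for jobs that do not match it


Tree = Union[Leaf, Node]


def positive_paths(tree, prefix=()):
    """Collect every root-to-leaf path that ends in a relevant leaf.

    Each path is a list of (clause, must_match) pairs, i.e. a conjunction.
    """
    if isinstance(tree, Leaf):
        return [list(prefix)] if tree.relevant else []
    return (positive_paths(tree.if_true, prefix + ((tree.clause, True),))
            + positive_paths(tree.if_false, prefix + ((tree.clause, False),)))


def to_query(paths):
    """Render the paths as a boolean query string: an OR of ANDs."""
    def literal(clause, must_match):
        return clause if must_match else f"NOT {clause}"
    return " OR ".join(
        "(" + " AND ".join(literal(c, m) for c, m in path) + ")"
        for path in paths
    )


def matches(job_clauses, paths):
    """True if the job satisfies at least one conjunction."""
    return any(all((c in job_clauses) == m for c, m in path) for path in paths)


def evaluate(jobs, paths):
    """Offline check: recall, and retrieved fraction as a crude latency proxy."""
    retrieved = {jid for jid, (clauses, _) in jobs.items() if matches(clauses, paths)}
    relevant = {jid for jid, (_, is_relevant) in jobs.items() if is_relevant}
    recall = len(retrieved & relevant) / len(relevant)
    candidate_fraction = len(retrieved) / len(jobs)
    return retrieved, recall, candidate_fraction


# Hypothetical tree, written by hand for illustration rather than learned.
tree = Node(
    "title:engineer",
    if_true=Leaf(True),
    if_false=Node(
        "skill:python",
        if_true=Node("location:sf", Leaf(True), Leaf(False)),
        if_false=Leaf(False),
    ),
)

# Toy corpus: job id -> (clauses the job matches, whether the ranker finds it relevant).
jobs = {
    "j1": ({"title:engineer", "skill:python"}, True),
    "j2": ({"skill:python", "location:sf"}, True),
    "j3": ({"skill:python"}, True),        # relevant, but the query misses it
    "j4": ({"location:sf"}, False),
    "j5": ({"title:engineer"}, False),     # retrieved, but not relevant
}

paths = positive_paths(tree)
query = to_query(paths)
retrieved, recall, candidate_fraction = evaluate(jobs, paths)
print(query)
print(sorted(retrieved), recall, candidate_fraction)
```

In this toy setting the query retrieves three of the five jobs and finds two of the three relevant ones, which shows the trade-off the abstract describes: a smaller candidate set is cheaper to score but can drop relevant jobs.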


Cited By

  • (2020) Ranking User Attributes for Fast Candidate Selection in Recommendation Systems. Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pages 2869-2876. DOI: 10.1145/3340531.3412742. Online publication date: 19-Oct-2020
  • (2018) Large Scale Search Engine Marketing (SEM) at Airbnb. The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pages 1357-1358. DOI: 10.1145/3209978.3210207. Online publication date: 27-Jun-2018
  • (2017) Personalized Job Recommendation System at LinkedIn. Proceedings of the Eleventh ACM Conference on Recommender Systems, pages 346-347. DOI: 10.1145/3109859.3109921. Online publication date: 27-Aug-2017

Published In

CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management
November 2017
2604 pages
ISBN:9781450349185
DOI:10.1145/3132847
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. information retrieval
  2. personalized search
  3. recommender systems

Acceptance Rates

CIKM '17 Paper Acceptance Rate 171 of 855 submissions, 20%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
