08/07/2016 Toby Dayton

LinkUp Data Versus Conference Board’s HWOL Data Series & Help Wanted Analytics/CEB Data

A friend recently sent me a graph of U.S. labor demand published by an asset management firm and asked me if the trends this year matched what we were seeing in LinkUp’s data. The firm’s data appears to be derived from, or at least closely resembles, data from the Conference Board, which sources its data from Wanted Analytics/CEB, and my reply was a fairly blunt no. While there are a few nuances I’ll get to in a minute, LinkUp’s data paints a starkly contrasting (and I’d argue far more accurate) view of the U.S. labor market than Wanted Analytics/CEB’s data.

Below is The Conference Board’s Help Wanted Online (HWOL) data series. In characterizing the first half of 2016, Gad Levanon, Chief Economist, North America, at The Conference Board stated, “The first half of 2016 has shown a substantial drop in the level of online advertised vacancies.”


HelpWanted Data


While the HWOL series not surprisingly captures the recovery in the labor market since early 2009 (it was pretty hard to miss), the series shows a drop of 11% in online job listings since January. Over the same period of time, job openings in LinkUp’s search engine, which only indexes jobs directly from 30,000 corporate websites, rose 6% from 3.22 million openings to 3.35 million.


LinkUp Job Openings Since January 2014


The nuances mentioned previously arise from the fact that we are constantly adding new companies to our index – roughly 200 new companies per month. This obviously creates an upward bias in our data, although that bias is diminishing over time as we continue to migrate further into the long tail of smaller and smaller employers in the U.S. (new companies added to our index in 2016 have averaged 200 job openings as compared to 290 in 2015 and 380 in 2014).

To account for the perpetual addition of new companies, albeit at lower average job counts as we add smaller and smaller businesses, we track job openings per company as one measure of labor demand across the country. There as well, our data points to continued growth in aggregate labor demand, with average job openings per company in LinkUp’s search engine rising 11% from 205 to 228.
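For anyone who wants to see why a per-company measure helps here, below is a minimal sketch in Python. The function names and the two small index snapshots are hypothetical and made up for illustration; the only figures taken from this post are the 205 and 228 averages. It is not LinkUp’s actual calculation, just the basic arithmetic behind the idea.

```python
def openings_per_company(total_openings: float, company_count: int) -> float:
    """Average job openings per indexed company."""
    if company_count <= 0:
        raise ValueError("company_count must be positive")
    return total_openings / company_count


def percent_change(start: float, end: float) -> float:
    """Percentage change from start to end."""
    if start == 0:
        raise ValueError("start must be non-zero")
    return (end - start) / start * 100.0


# Hypothetical index: one new company is added, so total openings rise
# even though the typical company is not hiring any more than before.
before = openings_per_company(total_openings=800, company_count=4)
after = openings_per_company(total_openings=1000, company_count=5)
print(before, after)  # per-company average is unchanged

# Averages quoted in this post: 205 openings per company rising to 228.
print(f"{percent_change(205, 228):.1f}%")
```

In the hypothetical snapshots, total openings grow by 25% purely because a company was added, while the per-company average stays flat, which is the distortion the per-company view is meant to strip out.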


Job Openings per company


So how might we explain the discrepancy between LinkUp’s data and the Conference Board’s HWOL data series? The first thing I’d point out, before answering that question, is that average monthly job gains in the U.S. have declined 16% this year from last, as measured by the Bureau of Labor Statistics’ monthly NFP report, which would appear to correlate with the Conference Board’s and Wanted Analytics/CEB’s data.


2016 NFP through July


But in a full employment environment, which we have argued vociferously since May is precisely how one should regard the U.S. labor market (see here, here, here, and here), the decline in average monthly net job gains this year should not be surprising in the least, as companies find it harder and harder to fill job openings in a tightening labor market. And while there is an obvious correlation between job openings and job growth, the correlation becomes far more nuanced in a full employment environment and requires more sophisticated analysis and robust modeling to identify and leverage for specific use cases, points I’ll elaborate on later in this post. It also requires accurate data, which brings me back to the discrepancy between LinkUp’s data and that of the Conference Board’s HWOL series.

The Conference Board sources its job openings data from Wanted Analytics, which was acquired by CEB last fall. Wanted Analytics/CEB aggregates job listings from thousands of sites on the web, including some company websites but also job boards, newspaper sites, industry associations, and pretty much anywhere else one might find a job opening online. And while it would seem as if this would lead to a large and insightful dataset, it actually leads to the exact opposite – a gigantic but wildly bloated, highly polluted, and extremely ‘noisy’ dataset that generates few if any insights into U.S. labor demand. The bloat and noise arise from two critical aspects of Wanted Analytics/CEB’s data sources.

The first is that Wanted Analytics/CEB sources the majority of its job listings from job boards, almost all of which are plagued by what we call ‘job board pollution’ – things like work-at-home scams and other fraudulent listings, identity theft posts, phishing scams, resume hunters, and expired listings. The second is the proliferation of job syndication over the past 3 years or so, arguably the most significant trend in online job listings since about 2013.

Nearly every job site on the web today takes a feed or multiple feeds of job listings from other job sites, and in turn, syndicates their own jobs to other sites. On the feed intake side, job boards use 3rd-party job content as ‘backfill’ to both generate incremental revenue and supplement their own listings in order to increase the depth and breadth of their job openings, even though the backfill not only contains polluted job listings but also results in duplicate listings. On the feed output side, job sites syndicate jobs to other sites in order to generate candidate flow for their employer advertisers which results in even more duplication downstream.

Job syndication in the online job space has become essentially ubiquitous over the past few years, with the result being that most job sites today have become gigantic cesspools of polluted and duplicative job listings. And because Wanted Analytics/CEB aggregates listings from nearly every job site on the web, they have accumulated a dataset not only plagued by toxic levels of job board pollution, but also rampant duplication. I’d speculate that the decline in job openings in the Conference Board’s HWOL data series in 2016 is the result of the industry’s efforts over the past 6 months to address the duplication issue.

But while most large sites today are taking long-overdue steps to reduce duplication on their own sites, Wanted Analytics/CEB will never eliminate duplication at the aggregate level as long as they accumulate listings from multiple sites due to the simple fact that employers typically advertise job openings on multiple sites. They will also never eliminate job board pollution as long as they aggregate jobs from job boards that accept shitty job listings (which nearly all still do).
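To make the duplication problem concrete, here is a small Python sketch. Everything in it is made up for illustration: the site names, employers, and the `dedup_key` helper are hypothetical, and matching on employer, title, and location is a deliberately naive assumption rather than a description of how any aggregator actually works.

```python
from collections import namedtuple

Listing = namedtuple("Listing", ["site", "employer", "title", "location"])

# Hypothetical listings: one opening advertised on three boards,
# plus a second, unrelated opening.
listings = [
    Listing("board-a", "Acme Corp", "Staff Nurse", "Minneapolis, MN"),
    Listing("board-b", "Acme Corp", "Staff Nurse", "Minneapolis, MN"),
    Listing("board-c", "acme corp", "Staff Nurse ", "Minneapolis, MN"),
    Listing("board-a", "Globex", "Welder", "Duluth, MN"),
]


def dedup_key(listing):
    """Naive identity for an opening: normalized employer, title, location."""
    fields = (listing.employer, listing.title, listing.location)
    return tuple(field.strip().lower() for field in fields)


raw_count = len(listings)
unique_count = len({dedup_key(listing) for listing in listings})
print(raw_count, unique_count)  # 4 2
```

Here the raw count is double the number of actual openings. In practice syndicated copies often carry reworded titles, missing employer names, or different locations, so a simple key like this one would miss many duplicates, which is exactly why aggregating across sites is so hard to clean up.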

The only way to eliminate job board pollution and duplication is to index jobs directly from employer websites, and that is precisely what LinkUp does – our dataset contains 3.4 million jobs from 30,000 employer websites. As a result, we have completely eliminated job board pollution and duplicate listings from our data, and because our index is updated daily, we have also eliminated expired listings because the job openings are always current.

And that brings us back, full-circle, to the discrepancies between our data and the Conference Board’s HWOL data series that relies on Wanted Analytics/CEB’s data. Not only does LinkUp possess a higher-quality, ‘cleaner’ dataset that delivers stronger, more accurate signals as to the true nature of labor demand in the U.S., but our forecasting model is based on a paired-month methodology to account for the addition of new companies to LinkUp’s dataset between months. And while using an alethiometer can be difficult at times, particularly in certain environments such as today’s where Chaos Syndrome is running rampant, it is that combination of better, more predictive data and a more sophisticated forecasting model that results in more accurate NFP forecasts like the one we made last Thursday for July’s jobs report.
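As a rough illustration of the paired-month idea, the Python sketch below compares openings only among companies that appear in both months, so that newly added companies do not inflate the month-over-month change. This is my simplified reading of the concept for illustration, not a description of LinkUp’s forecasting model; the `paired_month_change` function, the company names, and the counts are all hypothetical.

```python
def paired_month_change(prev_month: dict, curr_month: dict) -> float:
    """Percent change in openings among companies present in both months."""
    paired = prev_month.keys() & curr_month.keys()
    if not paired:
        raise ValueError("no companies appear in both months")
    prev_total = sum(prev_month[name] for name in paired)
    curr_total = sum(curr_month[name] for name in paired)
    if prev_total == 0:
        raise ValueError("paired companies had no openings in the prior month")
    return (curr_total - prev_total) / prev_total * 100.0


# Hypothetical openings by company; "initech" is new to the index in July.
june = {"acme": 100, "globex": 200}
july = {"acme": 110, "globex": 205, "initech": 50}

raw = (sum(july.values()) - sum(june.values())) / sum(june.values()) * 100.0
paired = paired_month_change(june, july)
print(f"raw: {raw:.1f}%  paired: {paired:.1f}%")  # raw: 21.7%  paired: 5.0%
```

The raw comparison suggests openings jumped more than 21%, but most of that comes from the newly indexed company; among the companies present in both months the increase is 5%.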



