The data scientists behind the scenes and how they put a spotlight on dark data

Over the past 18 months I have learned a lot about analytics and big data, especially as applied to the workforce. I spend a fair amount of my time speaking to customers, analysts and the media, and besides the most common question, “Do you have any examples you can share?”, I get asked about the people involved and the process of developing these applications. In the spirit of “people are the most important resource a company has,” I want to showcase that side of big data.

For timekeeping and scheduling, dark data (to save you a quick trip to your favorite search engine, I mean data that is collected and must be kept, but is not typically used) is, in this case, the audit trail of any change made to a timecard or schedule.

In Kronos there are sixteen different types of edits you can make to a timecard or schedule. Each of these edits represents a tiny trade of time and money between an employee and the company. Individually most are inconsequential, but in aggregate they represent tens or hundreds of millions of dollars. By and large these changes are transactions that everyone agrees to and that are necessary. The employee forgot to clock in, so the supervisor adds an “in punch.” Or an employee calls in sick and the supervisor changes a paycode from regular to sick in the schedule.

Occasionally, however, the changes are indicative of an issue. For example, a supervisor shifts a couple of minutes around during the week on an employee’s time card and eliminates premium pay. Or a supervisor changes a schedule after the fact so that it shows the employee worked only the hours they were scheduled.
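To make the first pattern concrete, here is a minimal Python sketch of a check for a week where small edits pull an employee from just over a premium-pay threshold to at or under it. Everything in it is an assumption for illustration: the 40-hour threshold, the 15-minute limit on what counts as "a couple of minutes", and the field names are hypothetical, not how Kronos stores or evaluates timecards.

```python
from dataclasses import dataclass

OVERTIME_THRESHOLD_HOURS = 40.0  # assumed weekly premium-pay threshold
MAX_SHAVED_MINUTES = 15          # assumed upper bound for a "small" edit


@dataclass
class WeeklyTimecard:
    employee: str
    supervisor: str
    hours_before_edits: float  # weekly total as originally punched
    hours_after_edits: float   # weekly total after supervisor edits


def premium_pay_removed(card: WeeklyTimecard) -> bool:
    """True when small edits moved the week from above the threshold to at or below it."""
    shaved_minutes = (card.hours_before_edits - card.hours_after_edits) * 60
    return (
        card.hours_before_edits > OVERTIME_THRESHOLD_HOURS
        and card.hours_after_edits <= OVERTIME_THRESHOLD_HOURS
        and 0 < shaved_minutes <= MAX_SHAVED_MINUTES
    )


# Made-up sample data
cards = [
    WeeklyTimecard("emp-1", "sup-A", 40.2, 40.0),  # 12 minutes shaved, premium gone
    WeeklyTimecard("emp-2", "sup-A", 38.0, 38.0),  # untouched
    WeeklyTimecard("emp-3", "sup-B", 44.0, 43.9),  # edited, but still in overtime
]
flagged = [c.employee for c in cards if premium_pay_removed(c)]
print(flagged)
```

A single rule like this only catches the one pattern you already thought of, and a flagged week may still have a perfectly legitimate explanation.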

These small changes are usually lost among the millions of transactions that occur throughout the year. Because they are so small, they are missed by most reports and audit teams. Only when the affected employee has the courage to speak up does a company become aware of them. By this time the consequences for all involved are significant, from degraded morale on the part of the employee to unnecessary cost in terms of productivity, turnover and financial impact for the company.

As the economy improves and companies feel the pain of turnover and lost performance when employee engagement sags, we have been engaged by companies to understand how they can identify these situations. The companies know the answer is in the data because when someone files a grievance and points out the specific situation and dates, the HR department can immediately see what happened in the transactions.

The challenge is seeing these changes sooner, especially before someone is so frustrated that they file a grievance or the behavior becomes obvious to all. This is where one of our data scientists, who has a PhD in computer science, realized that this challenge is very similar to the one retailers face when they try to understand what millions of customer clicks on a website represent. The customers aren’t telling them why they click the way they do, and only a fraction of the clicks result in an order.

So the data scientist applied the same machine learning techniques to timekeeping data that retailers use when they analyze their web server logs. The result of his work, however, was very difficult to interpret unless you understood machine learning and clustering techniques. To simplify this, we had one of our visualization experts re-imagine the output in a way that a layperson could understand. Her interpretation was amazing in its simplicity!
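For readers who want a feel for what clustering audit-trail data can look like, here is a small, self-contained Python sketch. It is not the algorithm behind our tool. It assumes a plain k-means with two clusters, four made-up edit-type labels (not the actual sixteen) and invented supervisors, purely to show the idea of grouping similar editing behavior and looking at whoever falls outside the crowd.

```python
import math
from collections import Counter

# Hypothetical edit-type labels, for illustration only
EDIT_TYPES = ["add_punch", "move_punch", "delete_punch", "change_paycode"]


def profile(edits):
    """Turn a list of edit-type labels into a vector of proportions."""
    counts = Counter(edits)
    total = sum(counts.values()) or 1
    return [counts[t] / total for t in EDIT_TYPES]


def kmeans(points, k, iterations=20):
    """Tiny k-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        # Seed the next center with the point farthest from all current centers
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest center
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        # Move each center to the mean of its members
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up audit trail: supervisor -> edit types they made
audit_trail = {
    "sup-A": ["add_punch"] * 8 + ["change_paycode"] * 2,
    "sup-B": ["add_punch"] * 7 + ["change_paycode"] * 3,
    "sup-C": ["add_punch"] * 9 + ["change_paycode"] * 1,
    "sup-D": ["move_punch"] * 9 + ["add_punch"] * 1,
}
names = list(audit_trail)
labels = kmeans([profile(audit_trail[n]) for n in names], k=2)

# The smallest cluster holds the behavior that looks least like everyone else's
sizes = Counter(labels)
rare = min(sizes, key=sizes.get)
needs_review = [n for n, label in zip(names, labels) if label == rare]
print(needs_review)
```

Being in the small cluster is a reason to look closer, not proof of a problem; the supervisor may simply run a very different kind of department.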

The second challenge was that the data scientist had created a very flexible tool. The first prototype had a number of tuning parameters that required the user to take output from past results and enter it back in to help weight certain parameters for future analysis. We recognized that we couldn’t expect a typical business user, as opposed to a data scientist, to perform this tuning. So we narrowed the focus of what the tool could do and eliminated the tuning parameters.

[Image: clustering dashboard. An example of the machine learning dashboard in Workforce Auditor]

We were both nervous and excited about analyzing our first data set (we went in without knowing anything about the customer or their practices to ensure we didn’t bias the analysis). When we researched the results, there it was: we had found an issue that was previously unknown to that company. We tried it a second, third and fourth time. Each time we found something important to the customer that they either suspected but couldn’t prove or were completely unaware of. These small changes were indicative of million-dollar-plus issues that were looming for these companies but had now been avoided. Very exciting stuff.

We found supervisors gaming schedules to improve their own bonuses (the company has since tweaked the rules of the bonus). We found a store manager working extremely hard to rebuild her schedule each week because the forecast and automated schedule she received were off (the company immediately re-tuned the forecast for her store). There were many more examples, and we realized that we had developed quite a versatile tool. Its power is that it can evaluate the actions of thousands of employees and, in a matter of minutes, narrow them down to just a handful of situations that require further investigation.

With so many positive results, we fast-tracked the technology. It’s now available as Workforce Auditor and is included with our Workforce Analytics platform.

Takeaways from this experience?

1) Skills and experience really count in developing big data applications. No one is going from “Excel guru” to building a machine learning application overnight, and it takes multiple people to get it right.

2) Involving (internal or external) customers and their data is essential. No one could ever build this without deep domain knowledge and many different data sets to trial.

3) By focusing on the business problem rather than the technology, we created something that was streamlined and easy to use rather than a feature-laden product showcasing the power of machine learning.

When I have a little more time to write, I want to share how the newest member of our team used scheduling data and a network map to uncover undisclosed relationships in a company and what they were costing them. Stay tuned!

Variance reports apply to Timekeeping and Payroll too

As I introduced in my book Lean Labor, the Perfect Paycheck is a concept borrowed from the manufacturing term “Perfect Order”. A Perfect Paycheck is one that is accurate, delivered on time, and at the right price.

Delivering a Perfect Paycheck is a good first step in achieving Lean Labor. That’s because not only does it deliver quick reductions in waste, but more importantly it sets up an accurate baseline of data for making better labor-related decisions during the day.
But let’s face it, when it comes to spending more time building jet engines or making sure an employee’s time card is accurate, the jet engine gets the attention. This is the challenge that people in IT and Payroll present to me when their Perfect Paycheck efforts aren’t broadly adopted.
While there can be a variety of reasons for Lean or any type of project to stall, one common theme is that the participants don’t understand the issues and the benefits. Effort and reward may be experienced by different people or departments. Or when it comes to timekeeping and payroll, often it seems that the administrative pain isn’t worth the gain. A couple of minutes here and there are much less important than ensuring an order is delivered on time.
When it is Operations that doesn’t seem to be adopting the changes required to achieve the Perfect Paycheck, I ask, “What are they being asked to do and what are the consequences if they don’t?” The responses vary, ranging from “the Payroll department will fix the mistakes” to “people aren’t paid accurately all the time,” but it typically boils down to one situation: the Operations group doesn’t understand the cost of not making the change and feels that it has other tasks that have more impact on the company.
One of the simplest techniques to remedy this is what I call the “Paycheck Variance Report.” Operations is very familiar with tracking and controlling variances. A variance is an unexpected change in the cost or time needed to complete an operation. Variances can originate from a wide variety of places. It might be that the price of materials shot up recently, or that someone was on overtime when they worked that week and the work cost more than expected. It could also be a positive variance, where someone was able to accomplish something faster than expected and it cost less. Operations and Finance review these variances closely, and while they know they can never get every variance under control, they make a continuous effort to do so. One area I have never seen tracked in a variance report is the timekeeping and payroll process. As you might imagine, since these processes are not tracked, they are not improved unless the variance is too large to miss. To make your Paycheck Variance Report simple to understand within your company, get a sample of a variance report that is written today and copy the format.
I suggested to one payroll project manager who was suffering from project crawl that he start tracking all the issues and recording how much time and money they were costing Operations and the company as a whole. He could then deliver this Paycheck Variance Report to the Operations Manager on a monthly basis. The Operations Manager would then be able to look at this report as he would his other variance reports. He may not act on every item, but you can be sure that if any of the issues ranked higher in terms of cost or time than a more traditional variance, he would be on it in a flash. One thing about Operations: when it comes to saving time or money, I have never seen them biased about what they need to do to execute on the project. But they must be convinced that what they change is going to be worth the effort.
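If you want to prototype a Paycheck Variance Report before asking anyone for a formal one, a few lines of Python are enough. The sketch below is only an illustration under stated assumptions: the issue categories, the sample log and the 40-dollar loaded hourly rate are all made up, and your own report should copy the columns of a variance report your Operations team already reads.

```python
from collections import defaultdict

LOADED_HOURLY_RATE = 40.0  # assumed cost of one hour of rework, in dollars

# Made-up monthly log: (issue category, hours spent fixing it, pay correction in dollars)
issue_log = [
    ("missed punch", 0.25, 0.0),
    ("missed punch", 0.25, 0.0),
    ("wrong paycode", 0.5, 120.0),
    ("late timecard approval", 1.0, 0.0),
    ("wrong paycode", 0.5, 80.0),
]


def variance_report(log, hourly_rate):
    """Total count, hours and cost per issue category, most expensive first."""
    totals = defaultdict(lambda: {"count": 0, "hours": 0.0, "cost": 0.0})
    for category, hours, correction in log:
        row = totals[category]
        row["count"] += 1
        row["hours"] += hours
        row["cost"] += hours * hourly_rate + correction
    return sorted(totals.items(), key=lambda item: item[1]["cost"], reverse=True)


report = variance_report(issue_log, LOADED_HOURLY_RATE)
print(f"{'Issue':<24}{'Count':>6}{'Hours':>8}{'Cost ($)':>10}")
for category, row in report:
    print(f"{category:<24}{row['count']:>6}{row['hours']:>8.2f}{row['cost']:>10.2f}")
```

Sorting by cost puts the issue most likely to get the Operations Manager's attention at the top, which is the whole point of the report.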