Twitter Sensors: Detecting the Traffic Jam in Washington Caused by… John Oliver

In the first week of June, 1 in 5 Tweets about traffic, delays and congestion by people around the Washington Beltway were caused by John Oliver’s segment on Net Neutrality.

Here at Savi we are exploring a wide range of sensors to obtain useful insights that can be used to make work and routine activities faster, more efficient and less risky. One of our Alpha Tests is examining the use of “arrays” of highly targeted Twitter sensors to detect early indications of traffic congestion, accidents and other sources of delays. Specifically, we are testing whether Twitter is a good traffic sensor (by “good,” in “data science speak,” we mean that we can train a traffic-detection model that has good precision and recall). To do this, we set up a test bed around the nation’s second-worst commuter corridor: the Washington DC Beltway (our own backyard at Savi).
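For readers who do not speak “data science,” here is a minimal sketch of how precision and recall could be computed for a traffic-tweet classifier. The function name and the five labels are hypothetical and for illustration only; this is not the model or data from our test bed.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two parallel lists of booleans.

    predicted[i] is True when the model flagged tweet i as traffic-related;
    actual[i] is True when a human labeler judged it traffic-related.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    # Guard against division by zero when nothing was flagged or nothing was relevant.
    flagged = true_pos + false_pos
    relevant = true_pos + false_neg
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / relevant if relevant else 0.0
    return precision, recall


# Hypothetical labels for five tweets (illustrative only).
predicted = [True, True, True, False, False]
actual = [True, True, False, True, False]
precision, recall = precision_recall(predicted, actual)
```

Precision answers “of the tweets we flagged, how many were really about traffic?” while recall answers “of the real traffic tweets, how many did we catch?” A useful sensor needs both to be high.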

Earlier this month our Twitter “Sensor Array” picked up an interesting surge in highly localized tweets about traffic-related congestion and delays. This was not an expected “bad commuter-day”-like surge. The number of topic- and geographically-related tweets seen on June 3rd was more than double the expected number for a Tuesday in June around the Beltway; the number seen during lunchtime was almost 5x normal.
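The comparison behind “more than double the expected number” can be sketched in a few lines. This assumes the expected count is simply the mean of comparable past days; the counts, function names and the 2.0 threshold are made up for illustration, and a real baseline would likely be more careful about seasonality and time of day.

```python
from statistics import mean


def surge_ratio(observed, baseline_counts):
    """Ratio of an observed tweet count to the mean of comparable past counts."""
    if not baseline_counts:
        raise ValueError("need at least one baseline count")
    expected = mean(baseline_counts)
    if expected == 0:
        raise ValueError("baseline mean is zero; ratio is undefined")
    return observed / expected


def is_surge(observed, baseline_counts, threshold=2.0):
    """Flag a surge when the observed count reaches threshold x the baseline."""
    return surge_ratio(observed, baseline_counts) >= threshold


# Hypothetical daily counts for earlier Tuesdays (illustrative only).
past_tuesdays = [100, 120, 110, 90]
ratio = surge_ratio(230, past_tuesdays)
```

The same function works at a finer grain, for example comparing one lunchtime window against the same window on earlier days.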

So what was the cause? Before answering, it is worth taking a step back.

The folks at Twitter have done a wonderful job of not only allowing you to fetch tweets based on topics, hashtags and geographies, but also adding some great machine learning-driven processing to screen out likely spammers and suspect accounts. Nevertheless, Twitter data, like all sensor data, is messy. It is common to see Tweets with misspelled words, words used out of context, or simply nonsensical content. In addition, people frequently repeat the same tweets throughout the day (a tactic to raise social media exposure) and do lots of other things that you must train the machine to account for.

That’s why we use a “Big Data” Lambda Architecture model to process all of our streaming sensor data. Not only do we apply rules in real-time (a.k.a. Complex Event Processing) to detect patterns as they happen; we also keep a permanent copy of all raw data that we can explore to discover new patterns and improve our machine learning models.
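As a toy illustration of that split, the sketch below keeps an append-only raw store (the batch side) next to a running, de-duplicated keyword count (the real-time side). The class, its method names and the sample tweets are all hypothetical; a real Lambda Architecture would use a stream processor and a distributed store rather than in-memory Python lists.

```python
import re


class TweetPipeline:
    """Toy Lambda-style pipeline: a speed path plus an append-only raw store."""

    def __init__(self, keywords):
        self.keywords = set(keywords)
        self.raw_store = []   # batch side: every tweet, kept unmodified
        self.seen = set()     # (user, normalized text) pairs already counted
        self.live_count = 0   # speed side: running count of rule matches

    @staticmethod
    def normalize(text):
        # Lowercase and collapse whitespace so repeated postings compare equal.
        return re.sub(r"\s+", " ", text.lower()).strip()

    @staticmethod
    def words(normalized_text):
        return set(re.findall(r"[a-z0-9']+", normalized_text))

    def ingest(self, user, text):
        """Speed path: store the raw tweet, then apply the real-time rule."""
        self.raw_store.append((user, text))
        key = (user, self.normalize(text))
        if key in self.seen:
            return False      # same user repeating the same tweet
        self.seen.add(key)
        if self.words(key[1]) & self.keywords:
            self.live_count += 1
            return True
        return False

    def recount(self, keywords):
        """Batch path: re-scan all raw data with a different rule."""
        keywords = set(keywords)
        matched = set()
        for user, text in self.raw_store:
            normalized = self.normalize(text)
            if self.words(normalized) & keywords:
                matched.add((user, normalized))
        return len(matched)


pipeline = TweetPipeline({"traffic", "congestion", "delays"})
pipeline.ingest("alice", "Traffic is awful on I-495")
pipeline.ingest("alice", "traffic is  awful on i-495")  # repeat, not counted again
pipeline.ingest("bob", "FCC site down due to heavy traffic")
pipeline.ingest("carol", "Lovely day for a walk")
```

Because nothing is thrown away, a rule that did not exist when the tweets arrived, such as `pipeline.recount({"fcc"})`, can still be run over the full history.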

That is exactly what we did as soon as we detected the surge. Here is what we found: the cause of the traffic- and congestion-related Twitter surge around the Beltway was… John Oliver:

  1. At 11pm EDT on June 1, on “Last Week Tonight,” John Oliver ran an interesting 13-minute segment on Net Neutrality. In this segment he encouraged people to visit the FCC website and comment on this topic.
  2. Seventeen hours later, the FCC tweeted that “[they were] experiencing technical difficulties with [their] comment system due to heavy traffic.” They tweeted a similar message 74 minutes later.
  3. This triggered a wave of re-tweets and comments about the outage in many places. Interestingly this wave was delayed in the Beltway. It surged the next day, just before lunchtime in DC, continuing throughout the afternoon. The two spikes were at lunchtime and just after work (evidently, people are not re-tweeting while working).
  4. By 4am on Wednesday the surge was over. People around the Beltway were back to their normal tweeting about traffic, construction, delays, lights, outages and other items confounding their commute.

Of course, as soon as we saw the new pattern, we adjusted our model to account for it. However, we thought it would be interesting to show in a simple graph how much “traffic on traffic, delays and congestion” Mr. Oliver induced in the geography around the Beltway over a 36-hour period. Over the first week of June, one out of every five Tweets about traffic, delays and congestion by people around the Beltway was not about commuter traffic, but instead about FCC website traffic caused by John Oliver:

Tweets from people geographically Tweeting around the Washington Beltway on traffic, congestion, delays and related frustration for first week of June.

Obviously, a simple count of tweets is a gross measure. To really use Twitter as a sensor, one needs to factor in many other variables: use of text vs. hashtags; tweets vs. mentions and re-tweets; the software client used to send the tweet (e.g., HootSuite is less likely to be a good source for accurate commuter traffic data); the number of followers the tweeter has (not a simple linear weighting); and much more. However, the simple count is a simple first-order visualization. It also makes interesting “water-cooler conversation.”
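To give a flavor of what “factoring in other variables” might look like, here is a rough sketch that replaces the simple count with a weighted one. Every number in it is an assumption chosen for illustration: the client weights, the retweet discount, and especially the choice to dampen high follower counts with a logarithm, which is just one possible non-linear weighting and not a description of our model.

```python
import math

# Hypothetical per-client weights; unknown clients fall back to DEFAULT_CLIENT_WEIGHT.
CLIENT_WEIGHTS = {"mobile": 1.0, "web": 0.8, "HootSuite": 0.2}
DEFAULT_CLIENT_WEIGHT = 0.5


def tweet_weight(followers, client, is_retweet):
    """Weight one tweet as commuter-traffic evidence (illustrative values only)."""
    # Assumed non-linear damping: 0 followers -> 1.0, 9 -> 0.5, 99 -> about 0.33.
    follower_factor = 1.0 / (1.0 + math.log10(1 + followers))
    client_factor = CLIENT_WEIGHTS.get(client, DEFAULT_CLIENT_WEIGHT)
    retweet_factor = 0.5 if is_retweet else 1.0
    return follower_factor * client_factor * retweet_factor


def weighted_count(tweets):
    """Sum of weights over an iterable of tweet dicts."""
    return sum(
        tweet_weight(t["followers"], t["client"], t["is_retweet"]) for t in tweets
    )


# Two made-up tweets: an original from a phone and a scheduled retweet.
sample = [
    {"followers": 0, "client": "mobile", "is_retweet": False},
    {"followers": 9, "client": "HootSuite", "is_retweet": True},
]
score = weighted_count(sample)
```

In this toy scoring, two tweets count as roughly one, because the scheduled retweet contributes very little.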


Big Data Analytics – Test Before You Invest

Big data remains one of the hottest topics on the planet. Barely a week goes by when I don’t get some kind of offer related to big data: download a white paper, go to this conference, attend that webinar, look at this new product offering. It’s not just talk either; I’ve spoken with business leaders who are investing hundreds of thousands or millions of dollars in big data initiatives. Having always been a data guy, I clearly see a lot of goodness in understanding and making sense of your data. But determining how to get value out of big data remains one of the top challenges and is, by far, where I spend the majority of my time.

The big data landscape continues to get more and more crowded. Big data solutions like Teradata, Cloudera and Netezza can certainly make it easier to store and process all of that data, and analytics tools like SPSS, R and Tableau can help with analyzing it, but the question remains:

What business problem(s) are you trying to solve with your big data initiative?

As technology has gotten more powerful, there is a tendency to push the envelope and tackle big, hairy audacious challenges or maybe do big data for big data’s sake. Solving challenges is a good thing, like completing the NY Times crossword puzzle, but before you jump on the bandwagon and invest in a big data project, take a step back and do two things.

First, like Covey says, “begin with the end in mind.” Answer some very simple questions:

  • Is this a revenue generating or cost savings initiative?
  • Am I adequately staffed to support the effort for the long term?
  • What is my ideal “time to return” on this investment?

This will help you figure out the business case for this investment—or better yet, apply a specific business context to your big data initiative. So before writing a check, make sure you know the problem you’re trying to solve, double check that a big data project is the best way to solve it, and make sure the cost is worth the reward. As Tony Carter wrote, never solve a non-problem.

Once you’re sure you have built the business case and know the questions your big data initiative will answer, eat the elephant one bite at a time. Run a small pilot program on a subset of your data, with a specific business context, and make sure that big data analysis is the best way to answer “the” question you’re asking. Once you’re confident and successful with “phase one,” go ahead and expand the investment. Otherwise, if you just jump into the deep end of the big data swimming pool without making sure you can swim, you and your career may be SOL, all thanks to big data.


Reusable Containers: The Under-Appreciated Champions of the Supply Chain

How often do you think about reusable containers?  My guess is not very often, right?  Reusable containers, however, are essential for enabling all aspects of the supply chain.  Coming in all shapes and sizes, containers are essential to many industries–transportation, oil & gas, heavy equipment manufacturing, waste management & recycling to name a few.  Basically, reusable containers are unsung heroes in more industries than you can imagine.

While reusable containers may not keep executives up at night, having end-to-end visibility in their supply chain sure does.  And this is one place where the plain old reusable container can make a huge difference—and allow executives to worry about something else.  The most effective way to gain that much desired end-to-end visibility is by putting tracking devices on reusable containers.  That way, supply chain leaders can track the movement of the shipments from suppliers to the manufacturing facility, within the manufacturing facility itself, and then as finished goods to the customer via the distribution network.  In addition, tracking containers not only provides supply chain visibility, but also helps ensure the right container is available at the right time.

If these assets—and reusable containers are important assets–are uniquely identified and tracked, a company can see significant benefits including streamlined operations, reduced theft and reduced manual processes.  And if your company owns or leases containers, there is the additional benefit of increased revenue due to the increase of asset utilization.  If you want to see how Savi can help with your reusable container needs, click here to use our uber-quick ROI calculator!

Firefighting: When ‘Just in Case’ is More than a Phrase

Last weekend, I spent Saturday afternoon at a picnic sponsored by the small historic town that I’ve lived in for nearly a decade.  One thing (of many) that I love about the town is the local fire station that’s less than a mile from my home, and how involved the firefighters are with the town (many attend the picnic, participate in our annual parade, etc. – basically just good people being good citizens).  I’ve always felt safe knowing that brave firefighters and skilled EMTs are around the corner just in case.

‘Just in case’ is a scary phrase because it can result in many things, but without a doubt it implies a sub-optimal outcome.  These can range from the relatively banal, “I will pick up extra candles and blankets just in case the storm knocks the power out” to the frightening, “I better learn judo just in case there’s a Dexter-like psychopath [without Harry’s code, of course] on the loose.”

That got me wondering, what “just in case” scenarios did the residents of Boulder, Colorado think about?  Could anyone predict the multi-faceted challenges of droughts, wildfires and floods in the same season?  And what did it take for their brave firefighters, EMTs and other first responders to battle these overlapping disasters?  It must have been a Herculean effort to move equipment, supplies, communications gear, etc. with basically no warning, and with lives on the line.

Unfortunately, we don’t know what the next ‘just in case’ disaster will be.  The future is unknown, there’s no crystal ball and the Magic 8-ball doesn’t work.  However, being prepared ahead of time to battle these tragedies is an essential step in mitigating the devastation.  And that’s one of the reasons I’m so excited to be a part of Savi.  We recently released a “Respond & Rescue” logistics solution that helps emergency workers and other first responders; it enables these brave men and women to locate & track critical assets anywhere in real-time, even in harsh, remote and/or devastated areas.  Savi’s solution not only sends alerts when assets deviate from their planned route / location or have been tampered with; its analytic capabilities also predict how long it will take to move and set up key assets, as well as needed quantities and best placement.  Now, that Savi solution makes the unknown “just in case” a bit less scary.
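For the technically curious, a route-deviation alert can be sketched very simply.  The version below assumes a planned route is a list of latitude/longitude waypoints and raises a flag when an asset is farther than a tolerance from all of them.  The function names, the coordinates and the 10 km tolerance are hypothetical, and this is an illustration of the general idea rather than how the Savi product works.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def off_route(position, waypoints, tolerance_km=10.0):
    """True when position is farther than tolerance_km from every waypoint.

    Simplification: distance is measured to the waypoints themselves, not to
    the road segments between them, so waypoints should be closely spaced.
    """
    nearest = min(haversine_km(*position, *wp) for wp in waypoints)
    return nearest > tolerance_km


# Made-up planned route and two reported asset positions.
planned_route = [(40.0, -105.3), (40.1, -105.3)]
on_plan = off_route((40.01, -105.3), planned_route)   # about 1 km from a waypoint
strayed = off_route((41.0, -105.3), planned_route)    # about 100 km away
```

A production system would also have to handle gaps in reporting and GPS noise before sounding an alarm.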

Not So Different After All: What I Learned at Savi’s Public Sector Open House

Even though I was born and raised in Washington, DC and spent most of my adult life in the area, my entire career has been in the commercial sector not the public sector. The public sector seemed like an entirely different universe with its own unique rules and regulations–worlds away from the tech-heavy career path I’m familiar with.  So when Savi recently hosted an Open House specifically for our public sector customers at our Alexandria, VA headquarters, it was quite eye opening.  What I learned, in a nutshell, is that the commercial and public sectors aren’t so different after all.

Here are two key examples:

  • They are agile.  Yes, agile.  With civilian employment at nearly 3 million people and the US Armed Forces adding close to another 1.5 million, one may not think the public sector can turn on a dime, but they can.  The public sector craves, seeks out and makes sense of what the commercial sector would call “market forces.”  But it’s a lot more serious than the latest business book and a “blue ocean” strategy—people’s lives are on the line.  The public sector adjusts and adapts as the world changes, protecting the safety of our country and its citizens, whether they are within our borders or located half way across the world.  That’s not only agility–it’s agility under tremendous pressure.
  • They embrace technology.  Just like commercial organizations, the public sector recognizes the value of technology.  They use technology to increase efficiency, eliminate manual processes, and store & analyze data to improve decision-making and results for the people they serve.  And they rely on technology: imagine if the Census Bureau had to count and analyze 100 million surveys by hand!  In fact, the 1890 Census was the first ever to use electric tabulating machines.  And it’s not just the statistical wonks.  Savi customers were some of the first (not just in the public sector, but in any sector) to recognize the value and uses of wireless technology such as GPS, RFID, etc.  So the public sector often is not only an early adopter, but the first adopter, of technology that re-shapes our world.

So it turns out that commercial and public sectors are not worlds apart after all, and both benefit from Savi’s solutions, whether it’s the US Army leveraging our technology for sense & respond logistics, or the commercial sector seeking supply chain predictability & intelligence.  And that’s the value of knowing, brought to you by Savi.