You haven't seen big data in action until you've seen Gartner analyst Doug Laney present 55 examples of big data case studies in 55 minutes. It's kind of like The Complete Works of Shakespeare, Laney joked at Gartner Symposium, though "less entertaining and hopefully more informative." (Well, maybe, for this tech crowd.) The presentation was, without question, a master class on the three Vs definition of big data: Data characterized by increasing variety, velocity and volume. It's a description, by the way, that Laney -- who also coined the term infonomics -- floated way back in 2001.
The 55 examples are intended not to intimidate but to instruct. Laney told the audience not to feel overwhelmed, but to home in on the big data case studies that might improve business performance at their own companies: "Yes, I know you're in industry x, but there are tremendous ideas that come from other industries that you need to consider adapting and adopting for your own industry," he said.
Here are 10 of them:
1. Macy's Inc. and real-time pricing. The retailer adjusts pricing in near-real time for 73 million (!) items, based on demand and inventory, using technology from SAS Institute.
2. Tipp24 AG, a platform for placing bets on European lotteries, and prediction. The company uses KXEN software to analyze billions of transactions and hundreds of customer attributes, and to develop predictive models that target customers and personalize marketing messages on the fly. That led to a 90% decrease in the time it took to build predictive models. SAP is in the process of acquiring KXEN. "That's probably a great move by SAP to fill a predictive analytics gap they've long had," Laney said.
3. Wal-Mart Stores Inc. and search. The mega-retailer's latest search engine for Walmart.com includes semantic data. Polaris, a platform that was designed in-house, relies on text analysis, machine learning and even synonym mining to produce relevant search results. Wal-Mart says adding semantic search has increased the rate at which online shoppers complete a purchase by 10% to 15%. "In Wal-Mart terms, that is billions of dollars," Laney said.
4. Fast food and video. This company (Laney wasn't giving up who) is training cameras on drive-through lanes to determine what to display on its digital menu board. When the lines are longer, the menu features products that can be served up quickly; when the lines are shorter, the menu features higher-margin items that take longer to prepare.
5. Morton's The Steakhouse and brand recognition. When a customer jokingly tweeted the Chicago-based steakhouse chain and requested that dinner be sent to the Newark airport, where he would be getting in late after a long day of work, Morton's became a player in a social media stunt heard 'round the Interwebs. The steakhouse saw the tweet, discovered he was a frequent customer (and frequent tweeter), pulled data on what he typically ordered, figured out which flight he was on, and then sent a tuxedo-clad delivery person to serve him his dinner. Sure, the whole thing was a publicity stunt (that went viral), but that's not the point. The question businesses should be asking themselves: "Is your company even capable of something like this?" Laney said.
6. PredPol Inc. and repurposing. The Los Angeles and Santa Cruz police departments, a team of educators and a company called PredPol have taken an algorithm used to predict earthquakes, tweaked it and started feeding it crime data. The software can predict where crimes are likely to occur down to 500 square feet. In LA, there's been a 33% reduction in burglaries and 21% reduction in violent crimes in areas where the software is being used.
7. Tesco PLC and performance efficiency. The supermarket chain collected 70 million refrigerator-related data points coming off its units and fed them into a dedicated data warehouse. Those data points were analyzed to keep better tabs on performance, gauge when the machines might need to be serviced and do more proactive maintenance to cut down on energy costs.
8. American Express Co. and business intelligence. Hindsight reporting and trailing indicators can only take a business so far, AmEx realized. "Traditional BI [business intelligence] hindsight-oriented reporting and trailing indicators aren't moving the needle on the business," Laney said. So AmEx started looking for indicators that could really predict loyalty and developed sophisticated predictive models to analyze historical transactions and 115 variables to forecast potential churn. The company believes it can now identify 24% of Australian accounts that will close within the next four months.
9. Express Scripts Holding Co. and product generation. Express Scripts, which processes pharmaceutical claims, realized that those who most need to take their medications were also those most likely to forget to take their medications. So they created a new product: Beeping medicine caps and automated phone calls reminding patients it's time to take the next dose.
10. Infinity Property & Casualty Corp. and dark data. Laney defines dark data as underutilized information assets that have been collected for a single purpose and then archived. But given the right circumstances, that data can be mined for other reasons. Infinity, for example, realized it had years of adjusters' reports that could be analyzed and correlated to instances of fraud. It built an algorithm out of that project and used the data to reap $12 million in subrogation recoveries.
Welcome to The Data Mill, a weekly column devoted to all things data. Heard something newsy (or gossipy)? Email me or find me on Twitter at @TT_Nicole.
USGS has a wealth of GIS-based models for the distribution of a wide range of climate and environmental variables that Griffin’s team is now combining with the results of polymerase chain reaction–based assays for particular microorganisms, including anthrax, Bacillus species, and Naegleria fowleri, a highly lethal amoeba that consumes brain tissue. For the latter organism, low levels of copper and high levels of zinc appear to correlate with the locations where cases of infection with the amoeba have been reported. These models were scheduled to be available on the USGS website for public use in late 2016.
In response to a question, Griffin said it is important to combine knowledge about microbial biology with the information gleaned from the model. For example, Griffin’s analysis identified strontium as an element present in soils where anthrax was found, and when an anthrax researcher questioned him about this, he was able to remind the researcher that strontium is critical to anthrax spore formation.
GIS AND VECTOR-BORNE DISEASES
By combining published climate and geographical data with records of infectious disease outbreaks and the known locations of the vectors that transmit the infectious organisms, and by using a tool called similarity search, Attaway has been able to generate maps that relate environmental and climate conditions to the likelihood of future outbreaks (see Figure 4-4). A similarity search, which relies on a statistical measure known as cosine similarity, makes it possible to identify the features or candidates that are most similar or dissimilar to specific features or attributes, in much the same way that a consumer application such as Yelp makes recommendations based on a customer's criteria.
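As a rough illustration of the idea behind cosine similarity (not the ESRI tool itself), the following Python sketch ranks candidate locations by how closely their attribute vectors point in the same direction as those of a known outbreak site. The site names, the choice of three attributes, and all of the numbers are hypothetical and assumed to be already scaled to a common range.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length attribute vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_by_similarity(target, candidates):
    """Return (name, score) pairs, most similar to the target first."""
    scored = [(name, cosine_similarity(target, vec))
              for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical attributes: [temperature, precipitation, elevation],
# each scaled to 0..1. These are illustrative values, not real data.
outbreak_site = [0.9, 0.8, 0.1]
candidates = {
    "site_a": [0.85, 0.75, 0.15],
    "site_b": [0.2, 0.1, 0.9],
    "site_c": [0.5, 0.5, 0.5],
}

ranking = rank_by_similarity(outbreak_site, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because cosine similarity compares only the direction of the vectors, the attributes need to be scaled consistently before comparison; otherwise a variable measured in large units, such as elevation in meters, would dominate the score.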
ArcGIS is another tool that ESRI has developed for predictive analysis, Attaway said. This free tool provides the ability to look at suitable locations for outbreaks and offers other applications as well, such as threat detection, drug use, and urban planning, based on historical data. ArcGIS uses a process called pattern-of-life analysis that enables hypothesis testing and retesting over multiple iterations and produces predictive maps. For example, Attaway and his colleagues used ArcGIS to analyze temperature, precipitation, elevation, land cover, population density, and other variables available from public sources to identify locations suitable for year-round Aedes mosquito activity (see Figure 4-5). In response to a question, he acknowledged that while the analysis itself can be done in minutes, it depends on the availability of data collected over months and even years. ArcGIS’s strength, he said, is its ability to pull data together from a variety of sources, analyze it, and produce actionable insights.
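The kind of multi-variable screening described above can be sketched, very loosely, as a rule applied to each grid cell. The Python example below is an assumption-laden illustration rather than ArcGIS functionality or Attaway's method: the `Cell` structure, the function name, and every threshold are placeholders chosen only to show the shape of the logic, and they are not entomological values from the workshop.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cell:
    """One map cell with a few of the variables mentioned in the text."""
    name: str
    monthly_min_temp_c: List[float]  # twelve monthly minimum temperatures
    annual_precip_mm: float
    elevation_m: float


# Placeholder thresholds for illustration only (not from the source).
MIN_TEMP_C = 10.0
MIN_PRECIP_MM = 500.0
MAX_ELEVATION_M = 1500.0


def suitable_year_round(cell: Cell) -> bool:
    """True only if every month is warm enough and the other limits hold."""
    warm_all_year = all(t >= MIN_TEMP_C for t in cell.monthly_min_temp_c)
    wet_enough = cell.annual_precip_mm >= MIN_PRECIP_MM
    low_enough = cell.elevation_m <= MAX_ELEVATION_M
    return warm_all_year and wet_enough and low_enough


# Hypothetical cells: one warm all year, one with cold winter months.
cells = [
    Cell("coastal", [18.0] * 12, 1200.0, 20.0),
    Cell("inland", [2.0, 4.0] + [15.0] * 10, 800.0, 300.0),
]
suitable = [c.name for c in cells if suitable_year_round(c)]
print(suitable)
```

A hard-threshold rule like this is the simplest possible form of suitability screening. It makes the point in the text concrete: the computation itself is quick, but it is only as good as the long-term data behind each variable.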
MODELING THE SPREAD OF DISEASE AT SCALE
The challenge that Sadilek is attempting to address involves using artificial intelligence or machine learning in combination with online data to enable the