Machine learning tools for fairness, at scale

Justice (source: Pixabay)
Check out the machine learning sessions at the Strata Data Conference in London, May 21-24, 2018. Hurry: best price ends February 23.
The problem of fairness comes up in any discussion of data ethics. We’ve seen analyses of products like COMPAS, we’ve seen the maps that show where Amazon first offered same-day delivery, and we’ve seen how job listings shown to women are skewed toward lower-paying jobs. We also know that “fair” is a difficult concept for any number of reasons, not the least of which is the data used to train machine learning models. Kate Crawford’s recent NIPS keynote, The Trouble with Bias, is an excellent introduction to the problem. Fairness is almost always future-oriented and aspirational: we want to be fair, we want to build algorithms that are fair. But the data we train with is,…
Original Post: Machine learning tools for fairness, at scale

Put machine learning to work in the real world

Busy city street (source: Pxhere.com)
Check out the “Data Science and Machine Learning” sessions at the Strata Data Conference in San Jose, March 5-8, 2018. Hurry: best price ends January 19.
We’re in an empirical era of machine learning. Companies are now building platforms that facilitate experimentation and collaboration. At our upcoming Strata Data Conference in San Jose, we have many tutorials and sessions on “Data Science and Machine Learning” (including two days of sessions on enterprise applications of deep learning), and “Data Engineering & Architecture” (including sessions on streaming/real-time from several open source communities). If you want to understand how companies are using big data and machine learning to reinvigorate their businesses, there are many case studies on the schedule geared toward hands-on technologists, and sessions aimed at managers and executives.
Putting data and machine learning technologies to work
Over the…
Original Post: Put machine learning to work in the real world

Square off: Machine learning libraries

Main reading room at the Library of Congress (source: Library of Congress on Flickr)
Check out the session, “The Journey of Machine Learning Platform Adoption in Enterprise,” at Strata London, May 21-24, 2018. Hurry: best price ends February 23.
Choosing a machine learning (ML) library to solve predictive use cases is easier said than done. There are many to choose from, and each has its own niche and benefits suited to specific use cases. Even for someone with decent experience in ML and data science, it can be an ordeal to vet all the varied solutions. Where do you start? At Salesforce Einstein, we have to constantly research the market to stay on top of it. Here are some observations on the top five characteristics of ML libraries that developers should consider when deciding which library to use:
1. Programming…
Original Post: Square off: Machine learning libraries

We need to build machine learning tools to augment machine learning engineers

Crowd (source: Pixabay)
Check out the machine learning sessions at the Strata Data Conference in London, May 21-24, 2018. Hurry: best price ends February 23.
In this post, I share slides and notes from a talk I gave in December 2017 at the Strata Data Conference in Singapore offering suggestions to companies that are actively deploying products infused with machine learning capabilities. Over the past few years, the data community has focused on infrastructure and platforms for data collection, including robust pipelines and highly scalable storage systems for analytics. According to a recent LinkedIn report, the top two emerging jobs are “machine learning engineer” and “data scientist.” Companies are starting to staff to put their data infrastructures to work, and machine learning is going to become more prevalent in the years to come.
Figure 1. Slide by Ben Lorica.
As more companies start…
Original Post: We need to build machine learning tools to augment machine learning engineers

Bringing AI into the enterprise

Blythe House preparing totals for daily balance, 1930s (source: Post Office Savings Bank (UK) on Wikimedia Commons)
Check out Kris Hammond’s tutorial, “AI in the Enterprise,” at our 2018 AI Conference in New York City or Beijing. Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.
In this episode of the Data Show, I spoke with Kristian Hammond, chief scientist of Narrative Science and professor of EECS at Northwestern University. He has been at the forefront of helping companies understand the power, limitations, and disruptive potential of AI technologies and tools. In a previous post on machine learning, I listed types of use cases (a taxonomy) for machine learning that could just as well apply to enterprise applications of AI. But how…
Original Post: Bringing AI into the enterprise

8 fintech trends on our radar for 2018

Financial charts (source: TeroVesalainen via Pixabay)
Check out the session “AI in personal finance: More than just chatbots” at the Artificial Intelligence Conference in New York, April 29-May 2, 2018. Hurry: best price ends February 2.
2017 saw big changes, a lot of investment, and some regulatory challenges in fintech. What will 2018 bring? Here’s what we’ll be watching in the coming year.
1. AI will be implemented across the stack
AI is sweeping across all industry sectors, including financial services. AI touches customer interactions (voice services like Siri and dialog systems), fraud detection, trading, and risk management (machine learning), and is being used to automate many back-office tasks (robotic process automation). AI technologies are also giving rise to new fintech startups that use techniques like computer vision to unlock new datasets (e.g., aerial images).
2. New products will make advanced analytics easier
Talk to…
Original Post: 8 fintech trends on our radar for 2018

What lies ahead for data in 2018

Traffic lights (source: jonbonsilver via Pixabay)
See Ben Lorica’s video “Trends in AI, Data Science, and Big Data” on Safari for a recap of research initiatives and movements in 2017.
Here’s what we expect to see, or see more of, in the data world in 2018.
1. New tools will make graphs and time series easier, leading to new use cases
Graphs and time series have been a crucial part of the explosion in big data. 2018 will see the emergence of a new generation of tools for storing and analyzing graphs and time series at large scale. These new analytic and visualization tools will help product groups devise new offerings, especially for use cases in security and fraud detection.
2. More companies will join data partnerships to share data
In 2016, I started hearing companies express interest in data sharing platforms, and startups have…
Original Post: What lies ahead for data in 2018

How machine learning will accelerate data management systems

Indexed (source: Stuart Caie on Flickr)
Tim Kraska will speak on “Learned Index Structures” at the AI Conference in New York, April 29 to May 2. Hurry: best price ends February 2. Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS.
In this episode of the Data Show, I spoke with Tim Kraska, associate professor of computer science at MIT. To take advantage of big data, we need scalable, fast, and efficient data management systems. Database administrators and users often find themselves tasked with building index structures (“indexes” in database parlance), which are needed to speed up data access. Some common examples include: B-Trees, used for range requests (e.g., assemble all sales orders within a certain time frame); hash maps, used for key-based lookups; Bloom…
Original Post: How machine learning will accelerate data management systems
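
The two index types named in the excerpt above serve different access patterns, and a small sketch can make the contrast concrete. The Python below is only an illustration under stated assumptions: a sorted list with binary search stands in for a B-Tree (it is not one), a plain dict stands in for a hash map, and the order data and the names `range_query` and `by_id` are hypothetical.

```python
import bisect

# Hypothetical sales orders as (day, order_id) pairs; illustration data only.
orders = [(3, "order-c"), (1, "order-a"), (7, "order-d"), (2, "order-b")]

# A sorted list searched with bisect stands in for a B-Tree here:
# both keep keys in order, which is what makes range requests cheap.
sorted_orders = sorted(orders)
sorted_days = [day for day, _ in sorted_orders]


def range_query(first_day, last_day):
    """Return the ids of orders placed between first_day and last_day, inclusive."""
    start = bisect.bisect_left(sorted_days, first_day)
    end = bisect.bisect_right(sorted_days, last_day)
    return [order_id for _, order_id in sorted_orders[start:end]]


# A dict stands in for a hash map: direct key-based lookup, no ordering.
by_id = {order_id: day for day, order_id in orders}

print(range_query(2, 3))   # ids of orders placed on days 2 through 3
print(by_id["order-d"])    # day on which order-d was placed
```

The ordered structure answers "everything between two keys" without scanning all rows, while the hash-style structure answers "exactly this key" but cannot answer range requests without a full scan.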

An introduction to regular expressions

Email regex (source: Pietrodn on Wikimedia Commons)
Sign up for Thomas Nield’s live online training “Advanced SQL for Data Analysis” on Safari. Next course is December 19, 2017.
Many data science, analyst, and technology professionals have encountered regular expressions at some point. This esoteric, miniature language is used for matching complex text patterns, and looks mysterious and intimidating at first. However, regular expressions (also called “regex”) are a powerful tool that requires only a small time investment to learn. They are almost ubiquitously supported wherever there is data. Several analytical and technology platforms support them, including SQL, Python, R, Alteryx, Tableau, LibreOffice, Java, Scala, .NET, and Go. Major text editors and IDEs like Atom Editor, Notepad++, Emacs, Vim, IntelliJ IDEA, and PyCharm also support searching files with regular expressions. The ubiquity of regular expressions must mean they offer universal utility, and,…
Original Post: An introduction to regular expressions
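
As a small illustration of the text-pattern matching the excerpt above describes, here is a minimal sketch using Python's standard `re` module. The pattern, the sample text, and the helper name `find_codes` are hypothetical and are not taken from the original post.

```python
import re

# Hypothetical pattern for illustration: three uppercase letters,
# a dash, then exactly four digits (e.g. "ABC-1234").
PRODUCT_CODE = re.compile(r"[A-Z]{3}-[0-9]{4}")


def find_codes(text):
    """Return every substring of text that matches the product-code pattern."""
    return PRODUCT_CODE.findall(text)


sample = "Shipped ABC-1234 and XYZ-9876; skipped ab-12."
print(find_codes(sample))
```

The same pattern syntax carries over, with minor dialect differences, to most of the platforms listed in the excerpt.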

How enterprises can build a digital business platform with pervasive integration

One of Kusama’s first Mirrored Rooms (source: Helsinki Art Museum on Wikimedia Commons)
For more insight, get the free ebook, “Integration and the Path to Becoming a Digital Business,” by Ciara Byrne.
Every company is now a software company. Digital transformation allows even large enterprises to adapt to changes in markets and customers at lightning speed, responding with new products, new processes, and new business models. Digital transformation doesn’t just require new technology; it requires a new, more agile mindset. Every line of business must have access to the digital tools needed to innovate at the edge, and it’s the job of the core IT team to provide them. Digital transformation relies on connecting data and systems, people and processes. Integration technologies have traditionally formed the nervous system of a large enterprise, connecting systems and moving data. But the human nervous…
Original Post: How enterprises can build a digital business platform with pervasive integration