08 May 2017

StreamAnalytix Spark Streaming Contest: Real-time Apps Built for Anomaly Detection with Apache Spark

At Impetus, we take data analytics innovation seriously. Very seriously. And one of the ways we continue to improve our software products and services, as well as retain our industry leadership standard, is through community programs that empower users to explore innovative uses for analytics technologies with our real-time streaming software, StreamAnalytix.

One of our programs was the inaugural Spark Streaming Innovation Contest, an international data hackathon that drew roughly 600 participants from around the world with a grand prize of $10,000 for the best submission. The contest ran from February through April and was open to the general community; we called on business analysts and engineers to solve real-world anomaly detection problems.

Because hackathon participants vary in skill level and experience, we outfitted them with two tools: Apache Spark and StreamAnalytix. We wanted them to be able to access their data quickly while eliminating the need to build complicated models to gain insights.

Apache® Spark™ is the most popular stream processing engine due to its open source framework, powerful programming model, and advanced analytics capabilities. However, Spark typically requires a lot of setup, coding, and modeling; therefore, we equipped users with StreamAnalytix, a development platform that enables users to create real-time stream processing and machine learning applications.

StreamAnalytix makes anomaly detection on Apache Spark extremely easy, allowing developers to leverage their data quickly and spend their time gaining insights instead of programming. With these tools in hand, hackathon participants could build anomaly detection applications quickly, even without prior experience with StreamAnalytix.

A panel of experts evaluated and scored each submission. The panel included the StreamAnalytix product team, architects, and engineers, as well as Alex Woodie, managing editor of Datanami, and Mike Matchett, senior analyst and consultant at Taneja Group.

Perhaps one of the most shocking discoveries we made is that this year’s winners weren’t even veteran data scientists. “I wouldn’t call myself a data science expert,” said Venu Kanaparthy of Redlands, California. Kanaparthy won the grand prize of $10,000 with his machine learning application for anomaly detection using Spark. Despite his limited experience, he says that he “was able to build a fully functional anomaly detection application on Spark working part-time evenings over about 4 weeks.”

A total of $18,000 was awarded in prize money, including awards to two runners-up. The first runner-up (awarded $5,000) was Anindya Saha from Foster City, California. The second runner-up (awarded $3,000) was Kalyan Janaki from Denver, Colorado. We congratulate our winners and are already looking forward to next year’s competition.

Powerful User Experiences

Having a powerful tool at your fingertips is one thing, but having a tool that is easy and exciting to use leads to better data analysis. And we know how important UX is. So, when the hackathon ended, we surveyed the participants about their experiences.

One of the biggest complaints we’ve come across from data analysts and data engineers in the field is that there are many tools in their toolkit, but they lack good user experiences. Good UX is innovation and is at the core of everything we do, so we listened carefully.

We were delighted to hear that they found the platform “very intuitive.” In fact, most commented on how easy it is to use, saying that “even a business analyst who has some familiarity with Spark can easily create and run Spark-based machine learning pipelines.” The main advantage of the platform, said one user, is the ability to “develop and deploy pipelines with no or less coding.” Overall, contestants agreed that StreamAnalytix is as valuable for business analysts as it is for data scientists and developers.

We are inspired. And we will create more programs like these. What about you? Ready to build a real-time streaming application?

30 Aug 2017

Aragon Research Recognizes StreamAnalytix as One of Four 2017 Hot Vendors in the Streaming Analytics Space

Every time an independent research firm identifies StreamAnalytix as a leading platform in the increasingly competitive space of streaming analytics, it is an exciting moment for us. Inclusion in a recently published report by Aragon Research, a technology-focused research and advisory firm, as one of the ‘Hot Vendors in Streaming Analytics 2017’ is one such proud moment.

The fact that we are one of only four players covered in the report makes it even more exciting. With this report, Aragon Research provides insight on new and noteworthy data management and streaming analytics providers. Each year, Aragon Research recognizes Hot Vendors across multiple markets that are doing something new or different. They may have new technology that expands capabilities, a new strategy that opens up markets, or just a new way of doing business that makes them worth assessing.

The report validates our focus on the use of open source technologies such as Spark Streaming and Apache Storm for real-time data insight. It recommends that enterprises evaluate StreamAnalytix if they need a single visual platform that leverages popular open source platforms for streaming ETL and advanced analytics and that is easy to use for both business and technical users.

Streaming data represents new avenues for creating value, and enterprises are beginning to pay attention to this new source of competitive advantage. The value is driven by new business insights from sensor data, web clickstreams, geolocation data, weather reports, market data, social media, and other event streams. Often it is the combination of multiple streaming and static sources of data that reveals powerful new insights. However, the successful use of stream processing engines such as Apache Spark™ to build such advanced analytical applications can be a challenge, as it typically requires deep technical and data science skills.

The solution? StreamAnalytix! It is a platform that makes creating real-time stream processing and machine learning applications on Apache Spark extremely easy. It now offers a Visual Spark Studio for development and life-cycle management of Apache Spark applications in both streaming and batch mode. Earlier this year, within a short span of six weeks, even engineers who were new to Apache Spark were able to build complex machine learning applications for anomaly detection on the StreamAnalytix platform, as part of a contest that we organised.

In closing, we feel very thankful and immensely encouraged by the Aragon Research recognition as a ‘Hot Vendor’ and by its validation of the benefit we strive to bring to enterprises: “powerful tooling and ease of use – over open source technologies”. For more information, read the full press release.

Also, you can sign up here for your free trial to experience the ease and power of StreamAnalytix.


Aragon Research Disclaimer

Aragon Research does not endorse vendors, or their products or services that are referenced in its research publications, and does not advise users to select those vendors that are rated the highest. Aragon Research publications consist of the opinions of Aragon Research and Advisory Services organization and should not be construed as statements of fact. Aragon Research provides its research publications and the information contained in them “AS IS,” without warranty of any kind.

26 Jun 2018

Impetus Technologies at San Jose DataWorks Summit 2018

Thousands of data professionals descended on San Jose, California from June 17 – 21 for DataWorks Summit 2018. This year’s event witnessed more than 150 breakout sessions and keynotes from the leaders in the industry.
Impetus was a Diamond sponsor at the event.

One of the highlights this year was the keynote by Praveen Kankariya, Founder and CEO of Impetus Technologies. Praveen’s session explored the challenges in creating a unified view of enterprise data, an essential building block for information-driven decision making and the advent of an AI-driven future. You can watch the session here.

Experts from Impetus and our sister organization, Kyvos Insights, also hosted three breakout sessions featuring insights from Fortune 500 client case studies covering modern data management and advanced analytics:

  • How a major bank leveraged Apache Spark and StreamAnalytix to rapidly re-build their insider threat detection application: Anand Venugopal, Product Head – StreamAnalytix, and Punit Shah, Sr. Solutions Architect, spoke about a StreamAnalytix use case to transform an aging insider threat detection application for a major bank.
  • Migrating analytics to the cloud at Fannie Mae: Praveen Kankariya hosted the session presented by Kevin Bates, Vice President of Enterprise Data Strategy Execution at Fannie Mae. The presentation described the modernization of Fannie Mae’s analytics platform and corresponding full migration of its Netezza warehouse to the cloud.
  • BI on data at massive scale with instant response times at Verizon: Ajay Anand, Kyvos Insights Vice President of Product Management and Marketing, hosted Arun Jinde, Sr. Technical Consultant for data warehousing at Verizon. Jinde shared the success that Verizon has enjoyed using Kyvos for low-latency OLAP processing at massive scale.


Larry Pearson
Vice President, Strategic Accounts - Marketing
05 Sep 2018

Impetus Technologies at Strata Data Conference 2018

Strata Data Conference, the most awaited data analytics event of the year, kicks off tomorrow.

Thousands of data professionals will come together for a three-day deep dive that will put the spotlight on emerging trends and best practices in the data industry.

Every year, the event witnesses the leading minds in the industry providing an insider’s perspective on data’s latest technologies—and the technical expertise needed to effectively put data strategies and implementations into action. This year, the event will cover topics on AI, predictive analytics, machine learning, IoT, security, data engineering, stream processing, cloud strategy, visualization, and more.

Impetus is a sponsor and exhibitor at the Strata Data Conference NY 2018.

Join us at booth #1335 as our experts showcase our innovations on data warehouse modernization technologies and provide immersive demos to walk you through the streaming analytics landscape of the future.

Don’t miss our exclusive session, Keys to Operationalize Enterprise 360, in which Anand Raman, Vice President, Sales and Business Development, will highlight the keys to achieving a single source of truth across the enterprise. The session will provide insights into the business and technology drivers for building a data-driven organization and highlight critical priorities, strategies, and processes to get to a unified data model.

Session details:

Topic: Keys to Operationalize Enterprise 360
Speaker: Anand Raman, Vice President, Sales and Business Development
Date: September 12, 2018
Time: 4:35 PM ET

Larry Pearson
Vice President, Strategic Accounts - Marketing
22 Jan 2019

Five Important Movements in the Data Analytics Landscape in 2019

Business is booming in the data industry. Investments have grown exponentially in recent years and according to industry experts, the trend is expected to continue. As data complexity rises and more uses for data analytics take the spotlight across industries, the demand for solutions to meet these challenges is expected to surge.

We’re witnessing five up-and-coming trends that stand out as increasingly important investment areas and concerns for organizations and IT leaders.

#1 Data Quality Management (DQM)

Gartner estimates that organizations lose an average of $15 million annually due to poor data quality. That explains why enterprises are focusing on the quality and the context in which the data is being interpreted. According to a survey by Business Application Research Center, data quality management will become a key priority for organizations in 2019.

DQM involves four steps:

  1. Data acquisition

  2. Implementing advanced data processes

  3. Data distribution

  4. Managing data oversight

#2 ML-based data governance

According to Gartner, organizations have started realizing that data governance is a necessity; however, they lack experience in implementing enterprise-wide governance programs with actual, tangible results. In 2019, organizations will focus more on data governance to strike a balance between data access and security. Machine learning based data preparation tools will help in governance and reinstate trust and reliability in analytics practices.

#3 More investment in hybrid cloud and AI

As businesses explore options to shift away from enterprise data warehouses to meet their demands and scale business operations, open stack-based cloud platforms are already popular. 2019 will witness increased interdependency between artificial intelligence and the cloud. With many data warehouses already in the cloud, players such as AWS, Microsoft Azure, IBM Cloud, and Google Cloud Platform will expand their AI cloud portfolios to let enterprises deploy AI on the cloud.

#4 Unified view of data

With data coming in from multiple sources, in different formats, and at different speeds, it is becoming imperative for businesses to have control over their data. While modern data warehouses and data lakes have helped enterprises bring all their data into one place, enterprises still struggle to get a unified view. In 2019, two massive trends will shift the focus:

  1. Different vendors will come together to standardize data models, leading to more consistent formats for cloud-based data sources.

  2. Enterprises will build data catalogs, which will enable audit of the entire data warehouse architecture. Acting as a centralized hub that everyone within the enterprise can access, these catalogs will link enterprise data management with analytics.

#5 Augmented analytics

As data scientists struggle with vast amounts of data to process, businesses relying on traditional machine learning platforms often miss key real-time insights. Augmented analytics, which is based on machine learning (ML) and natural language processing (NLP), can enhance data analytics, data sharing, and business intelligence. It reduces the dependency on data scientists and can also compensate for the business expertise that data science teams may lack.

In 2019, enterprises will use AI-powered augmented analytics tools to identify data sets, develop hypotheses, and identify data patterns automatically, reducing risk and accelerating error-free modernization.

Are you ready to meet the demand in 2019?

In 2018 the industry experienced radical changes in data storage, organization, and analysis. 2019 will see a faster increase in the number of companies, IT projects, vendors, solutions, and teams that store and process data to derive insights and realize ROI.

Simply put, the landscape is growing in ability, size, and complexity.

As the market shifts in the direction of maturity and complexity, it demands a new kind of data warehousing that is fundamental in enabling a unified, clear, and present view of your business.

We’re already fully engaged in exploring the possibilities of data analytics and continue to see innovation in logical data warehousing.

We invite you to join the conversation around 2019 data warehouse trends. Contact us today.

Larry Pearson
Vice President, Strategic Accounts - Marketing