  • 31 BIG DATA PLATFORMS YOU NEED TO KNOW

    https://builtin.com/big-data/big-data-platform

    It's unclear when plain old “data” became “big data,” but the latter term probably originated in 1990s Silicon Valley pitch meetings and lunch rooms. What's easier to pinpoint is how data has exploded in the 21st century.

    The total amount of data recorded until 2003 was five exabytes, or five quintillion bytes. (A quintillion is a million, cubed.) In 2011 alone, recorded data weighed in at 1.8 zettabytes, roughly 360 times more. By 2020, according to one estimate, each person will produce on average 1.5 GB of data per day. Multiply that by 365 days and then again by a good chunk of the world's 7.5 billion-person population, and the volume is almost unfathomably immense.
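
    The multiplication described above can be sketched in a few lines of Python. The per-person and population figures are the ones quoted in the text; the 50 percent share is purely an assumption standing in for "a good chunk," so the result is a rough illustration, not a forecast.

```python
# Back-of-the-envelope estimate of yearly data volume.
GB_PER_PERSON_PER_DAY = 1.5   # figure quoted in the article
DAYS_PER_YEAR = 365
WORLD_POPULATION = 7.5e9      # figure quoted in the article
ACTIVE_SHARE = 0.5            # ASSUMPTION: "a good chunk" taken as half

BYTES_PER_GB = 10**9
BYTES_PER_ZB = 10**21


def yearly_zettabytes(share=ACTIVE_SHARE):
    """Return the estimated data produced per year, in zettabytes."""
    total_bytes = (
        GB_PER_PERSON_PER_DAY * BYTES_PER_GB
        * DAYS_PER_YEAR
        * WORLD_POPULATION * share
    )
    return total_bytes / BYTES_PER_ZB


if __name__ == "__main__":
    print(f"{yearly_zettabytes():.2f} ZB per year at a 50% share")
```

    Under that assumption the estimate lands at about two zettabytes a year; passing a different `share` shows how sensitive the total is to the guess.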

    BIG DATA ANALYTICS PLATFORMS TO KNOW

    • Microsoft Azure
    • Cloudera
    • Sisense
    • Collibra
    • Tableau
    • MapR
    • Qualtrics
    • Oracle
    • MongoDB
    • Datameer

    Because the persistent gush of data from numerous sources is only growing more intense, lots of sophisticated and highly scalable big data analytics platforms, many of them cloud-based, have popped up to parse the ever-expanding mass of information.

     

    We’ve rounded up the 31 big data platforms that make petabytes of data feel manageable.

    SUMO LOGIC

    Location: Redwood City, Calif.

    What it does: The cloud-native Sumo Logic platform offers apps, including Airbnb and Pokémon GO, three types of support: it troubleshoots, tracks business analytics and catches security breaches, drawing on machine learning for maximum efficiency. It’s also flexible and able to manage sudden influxes of data.

    MICROSOFT AZURE

    Location: Seattle

    What it does: Users can analyze data stored on Microsoft’s cloud platform, Azure, with a broad spectrum of open-source Apache technologies, including Hadoop and Spark. Azure also features a native analytics tool, HDInsight, that streamlines data cluster analysis and integrates seamlessly with Azure's other data tools.

    CLOUDERA 

    Location: Palo Alto, Calif.

    What it does: Rooted in Apache’s Hadoop, Cloudera can handle massive amounts of data. Clients routinely store more than 50 petabytes in Cloudera’s Data Warehouse, which can manage data including machine logs, text, and more. Meanwhile, Cloudera’s DataFlow—previously Hortonworks’ DataFlow—analyzes and prioritizes data in real time.

    SHYFT DATA AND ANALYTICS PLATFORM

    Location: Boston

    What it does: SHYFT designed its data analytics platform for the life sciences industry. Built to protect patient privacy, the HIPAA- and PII-compliant tool automatically finds, imports and runs analytics on hundreds of data streams, blending them into a cohesive whole. The platform’s quick and slick data visualizations help users uncover unexpected correlations between datasets.

    GOOGLE CLOUD

    Location: Mountain View, Calif.

    What it does: Google Cloud offers lots of big data management tools, each with its own specialty. BigQuery warehouses petabytes of data in an easily queried format. Cloud Dataflow analyzes ongoing data streams and batches of historical data side by side. With Google Data Studio, clients can turn varied data into custom graphics.

    SISENSE 

    Location: NYC

    What it does: Sisense’s data analytics platform processes data swiftly thanks to its signature In-Chip Technology. The interface also lets clients build, use and embed custom dashboards and analytics apps. After a recent merger, Sisense is poised to combine its platform with Periscope Data’s. The merger will allow users to simultaneously comb data repositories with SQL, Python and R.

    COLLIBRA

    Location: NYC

    What it does: Designed to accommodate the needs of banking, healthcare and other data-heavy fields, Collibra lets employees companywide find quality, relevant data. The versatile platform features semantic search, which can find more relevant results by unraveling contextual meanings and pronoun referents in search phrases.

    TALEND

    Company location: San Francisco

    What the platform does: Talend’s trio of big data integration platforms includes a free basic platform and two paid subscription platforms, all rooted in open-source tools like Apache Spark. The paid platforms, though—one designed for existing data, the other for real-time data streams—come with more power and tech support. Both can clean and parse data, delete duplicate data and detect fraud automatically, among other functions.

    TABLEAU

    Company location: Austin, Texas

    What the platform does: The Tableau platform, available on-premises or in the cloud, allows users to find correlations, trends and unexpected interdependencies between data sets. The Data Management add-on further enhances the platform, allowing for more granular data cataloging and the tracking of data lineage.

    MAPR 

    Company location: Santa Clara, Calif.

    What the platform does: MapR’s platform, which the company terms "dataware," has attracted customers like American Express and Samsung with its massive capacity (exabytes!) and robust security measures. But it's not a platform so much as a meta-platform: a dashboard for managing big data spread across various platforms, clouds, servers and edge-computing devices. Its interface offers users a 10,000-foot perspective on the totality of their data while letting them manage various data types in one place.

    QUALTRICS EXPERIENCE MANAGEMENT

    Location: Seattle 

    What it does: Qualtrics’ platform lets companies assess the four key experiences that define their brand: customer experience; employee experience; product experience; and the brand experience, defined by marketing and brand awareness. Its analytics tools turn data on employee satisfaction, marketing campaign impact and more into actionable predictions rooted in machine learning and AI.

    1010DATA’S 1010EDGE

    Location: NYC

    What it does: This scalable cloud-based big data platform compiles and unifies data for giant enterprises, including Bank of America and Coca-Cola. Along the way, it can pull in relevant third-party data — like conversion rates and buyer behavior intel — from 1010Reveal. The searchable platform efficiently processes multiple complex queries at once.

    TERADATA

    Location: San Diego, Calif.

    What it does: Teradata’s Vantage analytics software works with various public cloud services, but users can also combine it with Teradata Cloud storage. This all-Teradata experience maximizes synergy between cloud hardware and Vantage’s machine learning and NewSQL engine capabilities. Teradata Cloud users also enjoy special perks—new Vantage features, for instance, are available on Teradata’s cloud before they're available to users of other cloud services.

    ORACLE 

    Company location: Westminster, Colo.

    What the platform does: Oracle Cloud’s big data platform can automatically migrate diverse data formats to cloud servers, purportedly with no downtime. The platform can also operate on-premises and in hybrid settings, enriching and transforming data whether it’s streaming in real time or stored in a centralized repository, also known as a "data lake." The platform comes in three formats, including basic and governance editions.

    DOMO

    Company location: American Fork, Utah

    What the platform does: Domo’s big data platform draws on clients’ full data portfolios to offer industry-specific findings and AI-based predictions. Even when relevant data sprawls across multiple cloud servers and hard drives, Domo clients can gather it all in one place with Magic ETL, a drag-and-drop tool that streamlines the integration process.

    MONGODB

    Location: NYC

    What it does: MongoDB doesn’t force data into spreadsheet-style rows and columns. Instead, its cloud-based platforms store data as flexible JSON documents; in other words, as digital objects that can be arranged in a variety of ways, even nested inside each other. Designed for app developers, the platforms offer of-the-moment search functionality. For example, users can search their data for geotags and graphs as well as text phrases.
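
    To make the idea of nested JSON documents concrete, here is a small sketch in plain Python. It uses only the standard `json` module and no MongoDB driver; the record, its field names and the `get_path` helper are all hypothetical, and the dotted-path lookup only loosely mirrors how nested fields are typically addressed in document databases.

```python
import json

# Hypothetical user record; every field name here is illustrative.
user_doc = {
    "name": "Ada",
    "tags": ["beta", "mobile"],
    "address": {"city": "New York", "geo": {"lat": 40.71, "lon": -74.01}},
    "orders": [
        {"sku": "A-100", "qty": 2},
        {"sku": "B-200", "qty": 1},
    ],
}


def get_path(doc, dotted_path):
    """Walk a nested document using a dotted path such as 'address.geo.lat'."""
    current = doc
    for key in dotted_path.split("."):
        # Lists are indexed by position, dictionaries by key.
        current = current[int(key)] if isinstance(current, list) else current[key]
    return current


# The whole document survives a round trip through JSON text.
serialized = json.dumps(user_doc)
restored = json.loads(serialized)

if __name__ == "__main__":
    print(get_path(user_doc, "address.geo.lat"))
    print(get_path(user_doc, "orders.1.sku"))
```

    Because the address and the orders live inside the same record, no separate tables or joins are needed to read them back.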

    CIVIS PLATFORM

    Location: Chicago

    What it does: Civis Analytics’ cloud-based platform offers end-to-end data services, from data ingestion to modeling and reports. Designed with data scientists in mind, the platform integrates with GitHub to ease user collaboration and is purportedly ultra-secure—both HIPAA-compliant and SOC 2 Type II-certified. 

    ALTERYX

    Company location: Broomfield, Colo.

    What the platform does: Alteryx’s designers built the company’s eponymous platform with simplicity and interdepartmental collaboration in mind. Its four interlocking tools allow users to create repeatable data workflows, stripping busywork from the data prep and analysis process, and to deploy R and Python code within the platform for quicker predictive analytics.

    ZETA INTERACTIVE’S MARKETING PLATFORM 

    Location: NYC

    What it does: Designed for marketers, this platform from Zeta Interactive pulls data from three different clouds onto one dashboard. (One cloud is devoted to marketing, another to customer experience and a third to in-depth customer data culled from millions of user profiles with permission.) The platform’s AI features sift through the diverse data, helping marketers target key demographics and attract new customers. 

    HEWLETT PACKARD ENTERPRISE’S VERTICA 

    Location: Palo Alto, Calif.

    What it does: This software-only SQL data warehouse is storage system-agnostic. That means it can analyze data from cloud services, on-premises servers and any other data storage space. Vertica works quickly thanks to columnar storage, which lets a query scan only the columns it needs. Its latest version offers predictive analytics rooted in machine learning for industries that include finance and marketing.
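
    The benefit of columnar storage can be illustrated with a toy comparison. The sketch below is not Vertica code and says nothing about its internals; the table, field names and functions are hypothetical, and plain Python lists stand in for on-disk storage.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "region": "east", "revenue": 120.0},
    {"id": 2, "region": "west", "revenue": 80.0},
    {"id": 3, "region": "east", "revenue": 200.0},
]


def to_columns(table):
    """Pivot a list of records into one list per column."""
    return {name: [record[name] for record in table] for name in table[0]}


# Column-oriented layout: each column is stored contiguously.
columns = to_columns(rows)


def total_revenue_row_store(table):
    # Every record has to be visited, even though only one field is used.
    return sum(record["revenue"] for record in table)


def total_revenue_column_store(table):
    # Only the "revenue" column is read; "id" and "region" are never touched.
    return sum(table["revenue"])


if __name__ == "__main__":
    print(total_revenue_row_store(rows), total_revenue_column_store(columns))
```

    Both functions return the same total, but the column-oriented version reads a single list, which is the intuition behind scanning only relevant data.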

    ARM TREASURE DATA

    Location: Mountain View, Calif.

    What it does: Treasure Data’s customer data platform sorts morasses of web, mobile and IoT data into rich, individualized customer profiles so marketers can communicate with their desired demographics in a more tailored and personalized way. 

    AMAZON WEB SERVICES

    Location: Seattle

    What it does: Best known as AWS, Amazon’s cloud-based platform comes with 11 analytics tools that are designed for everything from data prep and warehousing to SQL queries and data lake design. All the resources scale with your data as it grows in a secure cloud-based environment. Features include customizable encryption and the option of a virtual private cloud.

     

    ACTIAN AVALANCHE

    Location: San Francisco

    What it does: Actian’s cloud-native data warehouse, which debuted in March 2019, was built for near-instantaneous results, even if users run multiple queries at once. Backed by support from Microsoft and Amazon’s public clouds, it can analyze data in public and private clouds. For easy app use, the platform comes with ready-made connections to Salesforce, Workday and others.

    PIVOTAL GREENPLUM 

    Location: San Francisco

    What it does: Born out of the open-source Greenplum Database project, this platform uses PostgreSQL to conquer varied data analysis and operations projects, from quests for business intelligence to deep learning. Pivotal Greenplum can parse data housed in clouds and servers, as well as container orchestration systems. Additionally, it comes with a built-in toolkit of extensions for location-based analysis, document extraction and multi-node analysis.

    HITACHI VANTARA’S PENTAHO

    Location: Orlando, Fla.

    What it does: This platform streamlines the data ingestion process by forgoing hand coding and offering time-saving functions like drag-and-drop integration, pre-made data transformation templates and metadata injection. Once users add data, the platform can mine business intelligence from any data format thanks to its data-agnostic design.

    EXASOL

    Location: Nuremberg, Germany

    What it does: This intelligent, in-memory analytics database was designed for speed, especially on clustered systems. It can analyze all types of data, including sensor, online transaction and location data, via massively parallel processing. The cloud-first platform also analyzes data stored in appliances and can function purely as software.

    IBM CLOUD

    Location: Armonk, N.Y.

    What it does: IBM’s full-stack cloud comes with 170 built-in tools, including more than 20 for customizable big data management. Users can opt for a NoSQL or SQL database, or store their data as JSON documents, among other database designs. The platform can also run in-memory analysis and integrate open-source tools like Apache Spark. 

    MARKLOGIC

    Location: San Carlos, Calif.

    What it does: Users can import data into MarkLogic’s platform as is. Items ranging from images and videos to JSON and RDF files coexist peaceably in the flexible database, uploaded via a simple drag-and-drop process powered by Apache NiFi. Organized around MarkLogic’s Universal Index, files and metadata are easily queried. The database also integrates with a host of more intensive analytics apps.

    DATAMEER

    Location: San Francisco, Calif.

    What it does: Though it’s possible to code within Datameer’s platform, it’s not necessary. Users can upload structured and unstructured data directly from more than 70 data sources by following a simple wizard. From there, the point-and-click data cleansing and built-in library of more than 270 functions, like chronological organization and custom binning, make it easy to drill into data even if users don't have a computer science background.

    WAVEFRONT

    Location: Palo Alto, Calif.

    What it does: Designed for time-series data pulled from the likes of CollectD, JMX and Amazon Web Services, this platform specializes in spotting trends — and, more important, deviations from them. The latter capacity means that when something suspicious happens, users can send and receive intelligent alerts, activated by multi-dimensional criteria rather than simplistic thresholds. 
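
    The difference between a fixed threshold and an alert on deviation from a trend can be sketched with a trailing z-score. This is a generic illustration, not Wavefront's alerting logic; the function name, window size and limit are assumptions chosen for the example, and it checks a single metric rather than multi-dimensional criteria.

```python
from statistics import mean, pstdev


def deviation_alerts(series, window=5, z_limit=3.0):
    """Return indexes whose value sits more than z_limit standard
    deviations away from the mean of the preceding `window` points."""
    alerts = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu, sigma = mean(history), pstdev(history)
        # Skip flat histories, where a z-score is undefined.
        if sigma > 0 and abs(series[i] - mu) / sigma > z_limit:
            alerts.append(i)
    return alerts


# Hypothetical readings: steady values with one sudden spike at index 7.
readings = [10, 11, 10, 12, 11, 10, 11, 30, 11, 10]

# A steadily rising metric: a fixed threshold would eventually fire,
# but each point stays close to its own recent trend.
rising = list(range(0, 100, 10))

if __name__ == "__main__":
    print(deviation_alerts(readings))
    print(deviation_alerts(rising))
```

    The spike is flagged because it breaks from recent behavior, while the steadily rising series raises no alert even as its absolute values climb.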

    ALIBABA CLOUD

    Location: Hangzhou, Zhejiang, China

    What it does: The leading public cloud provider in China, Alibaba operates in 19 regions worldwide, including the U.S. Its popular cloud platform offers a variety of database formats and big data tools, including data warehousing, analytics for streaming data and speedy Elasticsearch, which can scan petabytes of data scattered across hundreds of servers in real time.

    Images via Shutterstock, social media and company websites.

  • Original article: https://www.cnblogs.com/dhcn/p/13032046.html