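Data + AI Summit Europe 2020 (formerly Spark + AI Summit Europe) was held on November 17-19, 2020. Because of the COVID-19 pandemic, this conference took place online, like the one held in June. It ran for three days: the first day was training, and the second and third days were the conference proper. The conference covered technical content from practitioners who use Apache Spark™, Delta Lake, MLflow, Structured Streaming, BI and SQL analytics, and deep learning and machine learning frameworks to solve tough data problems. The full agenda (https://databricks.com/dataaisummit/europe-2020/agenda) is available online.
Unlike the June conference, the keynotes this time had no major announcements, but the second and third days still offered solid content worth a look. Over the next few days, this WeChat official account will also introduce some of the more interesting talks, so stay tuned.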
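The conference topics covered the following areas:
- AI use cases and new opportunities;
- Best practices and use cases for Apache Spark™, Delta Lake, MLflow, and more;
- Data engineering, including streaming architectures;
- SQL analytics and BI using data warehouses and data lakes;
- Data science, including the Python ecosystem;
- Machine learning and deep learning applications;
- Production machine learning (MLOps);
- Research on large-scale data analytics and ML;
- Industry use cases.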
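How to download
Follow the WeChat official account 过往记忆大数据 or Java与大数据架构 and reply iteblog-9902 to get the slides.
Downloadable slides
Slides are available for download for the following talks: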
- 3D: DBT using Databricks and Delta
- Accelerated Training of Transformer Models
- Achieving Lakehouse Models with Spark 3.0
- Acoustics & AI for Conservation
- Active Governance Across the Delta Lake with Alation
- Add Historical Analysis of Operational Data with Easy Configurations in Fivetran Automated Data Integration
- Advanced Natural Language Processing with Apache Spark NLP
- Apache Liminal (Incubating)—Orchestrate the Machine Learning Pipeline
- Apache Spark Streaming in K8s with ArgoCD & Spark Operator
- Apply MLOps at Scale
- Arbitrary Stateful Aggregation and MERGE INTO
- Bank Struggles Along the Way for the Holy Grail of Personalization: Customer 360
- Building a Cross Cloud Data Protection Engine
- Building a Distributed Collaborative Data Pipeline with Apache Spark
- Building a MLOps Platform Around MLflow to Enable Model Productionalization in Just a Few Minutes
- Building a Real-Time Supply Chain View: How Gousto Merges Incoming Streams of Inventory Data at Scale to Track Ingredients Throughout its Supply Chain
- Building a SIMD Supported Vectorized Native Engine for Spark SQL
- Building a Streaming Data Pipeline for Trains Delays Processing
- Building a Streaming Microservices Architecture
- Building an ML Tool to predict Article Quality Scores using Delta & MLFlow
- Building Identity Graph at Scale for Programmatic Media Buying Using Apache Spark and Delta Lake
- Building Notebook-based AI Pipelines with Elyra and Kubeflow
- Building the Next-gen Digital Meter Platform for Fluvius
- CI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on Databricks
- Cloud-native Semantic Layer on Data Lake
- Common Strategies for Improving Performance on Your Delta Lakehouse
- Comprehensive View on Date-time APIs of Apache Spark 3.0
- Containerized Stream Engine to Build Modern Delta Lake
- Context-aware Fast Food Recommendation with Ray on Apache Spark at Burger King
- Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS SageMaker for Enterprise AI Scenarios
- Cost Efficiency Strategies for Managed Apache Spark Service
- Data Engineers in Uncertain Times: A COVID-19 Case Study
- Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes Beyond the Data Lake
- Data Privacy with Apache Spark: Defensive and Offensive Approaches
- Data Time Travel by Delta Time Machine
- Data Versioning and Reproducible ML with DVC and MLflow
- Databricks University Alliance Meetup - Data + AI Summit EU 2020
- Databricks Whitelabel: Making Petabyte Scale Data Consumable to All Our Customers
- Delta: Building Merge on Read
- Delta Lake: Optimizing Merge
- Designing and Implementing a Real-time Data Lake with Dynamically Changing Schema
- Detecting and Recognising Highly Arbitrary Shaped Texts from Product Images
- Deterministic Machine Learning with MLflow and mlf-core
- Developing ML-enabled Data Pipelines on Databricks using IDE & CI/CD at Runtastic
- Digital Turbine Adopts A Lakehouse to Scale to Their Analytics Needs
- Distributed and Scalable Model Lifecycle Capabilities
- Diving into Delta Lake: Unpacking the Transaction Log
- eBay’s Work on Dynamic Partition Pruning & Runtime Filter
- Efficient Query Processing Using Machine Learning
- Embedding Insight through Prediction Driven Logistics
- End to End Supply Chain Control Tower
- Extending Apache Spark – Beyond Spark Session Extensions
- Foundations of Data Teams
- Frequently Bought Together Recommendations Based on Embeddings
- From Query Plan to Query Performance: Supercharging your Apache Spark Queries using the Spark UI SQL Tab
- From Zero to Hero with Kafka Connect
- Generalized Pipeline Parallelism for DNN Training
- Getting Started with Apache Spark on Kubernetes
- Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads
- How a Media Data Platform Drives Real-time Insights & Analytics using Apache Spark
- How The Weather Company Uses Apache Spark to Serve Weather Data Fast at Low Cost
- Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and Parquet Reader
- Introducing MLflow for End-to-End Machine Learning on Databricks
- Koalas: Interoperability Between Koalas and Apache Spark
- Leveraging Apache Spark and Delta Lake for Efficient Data Encryption at Scale
- Livestream Economy: The Application of Real-time Media and Algorithmic Personalisation in Urbanism
- Materialized Column: An Efficient Way to Optimize Queries on Nested Columns
- MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestration of Machine Learning Pipelines
- MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams
- Migrate and Modernize Hadoop-Based Security Policies for Databricks
- Migrating Airflow-based Apache Spark Jobs to Kubernetes – the Native Way
- ML Production Pipelines: A Classification Model
- ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed Feedback Environment
- MLflow at Company Scale
- MLOps Using MLflow
- Model Experiments Tracking and Registration using MLflow on Databricks
- Monitoring Half a Million ML Models, IoT Streaming Data, and Automated Quality Check on Delta Lake
- Moving to Databricks & Delta
- NLP Text Recommendation System Journey to Automated Training
- Operating and Supporting Delta Lake in Production
- Optimising Geospatial Queries with Dynamic File Pruning
- Optimizing Apache Spark UDFs
- Our Journey to Release a Patient-Centric AI App to Reduce Public Health Costs
- Parallel Ablation Studies for Machine Learning with Maggy on Apache Spark
- Personalization Journey: From Single Node to Cloud Streaming
- Photon Technical Deep Dive: How to Think Vectorized
- Polymorphic Table Functions: The Best Way to Integrate SQL and Apache Spark
- Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch and More!)
- Productionizing Real-time Serving With MLflow
- Project Zen: Improving Apache Spark for Python Users
- Query or Not to Query? Using Apache Spark Metrics to Highlight Potentially Problematic Queries
- Ray and Its Growing Ecosystem
- Real-time Feature Engineering with Apache Spark Streaming and Hof
- Real-Time Health Score Application using Apache Spark on Kubernetes
- Reproducible AI Using PyTorch and MLflow
- Revealing the Power of Legacy Machine Data
- Scale and Optimize Data Engineering Pipelines with Software Engineering Best Practices: Modularity and Automated Testing
- Scale-Out Using Spark in Serverless Herd Mode!
- Scaling Machine Learning Feature Engineering in Apache Spark at Facebook
- Scaling Machine Learning with Apache Spark
- Seamless MLOps with Seldon and MLflow
- SHAP & Game Theory For Recommendation Systems
- Simplifying AI integration on Apache Spark
- Skew Mitigation For Facebook PetabyteScale Joins
- Solving Data Discovery Challenges at Lyft with Amundsen, an Open-source Metadata Platform
- Spark NLP: State of the Art Natural Language Processing at Scale
- Spark SQL Beyond Official Documentation
- Spark SQL Join Improvement at Facebook
- Speeding Time to Insight with a Modern ELT Approach
- Stateful Streaming with Apache Spark: How to Update Decision Logic at Runtime
- Stories from the Financial Service AI Trenches: Lessons Learned from Building AI Models in EY
- Streaming Inference with Apache Beam and TFX
- TeraCache: Efficient Caching Over Fast Storage Devices
- The Beauty of (Big) Data Privacy Engineering
- The Hidden Value of Hadoop Migration
- The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analytics Engineer
- The Pill for Your Migration Hell
- Transforming GE Healthcare with Data Platform Strategy
- Trust, Context and, Regulation: Achieving More Explainable AI in Financial Services
- Unlocking Geospatial Analytics Use Cases with CARTO and Databricks
- Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update/Delete SQL Operation
- Using Machine Learning at Scale: A Gaming Industry Experience!
- Using NLP to Explore Entity Relationships in COVID-19 Literature
- Using Redash for SQL Analytics on Databricks
- What is New with Apache Spark Performance Monitoring in Spark 3.0
- X-RAIS: The Third Eye
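When reposting this article, please include: Reposted from 过往记忆 (https://www.iteblog.com/)
Article link: "Data + AI Summit Europe 2020: All Slides in High Definition for Download" (https://www.iteblog.com/archives/9902.html)
Source: oschina
Link: https://my.oschina.net/u/4373067/blog/4777254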