Personal Picks: Data Product News (July 9, 2025)

This content originally appeared on DEV Community and was authored by Sagara

This is the English translation of the following article.
https://dev.classmethod.jp/articles/modern-data-stack-info-summary-20250709/

This is Sagara.

As a consultant in the Modern Data Stack field, I see a vast amount of information being released every day.

With so much happening, I'd like to use this article to summarize the Modern Data Stack-related news that has caught my attention over the past couple of weeks.

Disclaimer: This article does not cover all the latest updates for the products mentioned. It only includes information that I found interesting, based on **my own selection and biases**.

General Modern Data Stack

The Data Engineer Toolkit: Infrastructure, DevOps, and Beyond

MotherDuck's blog has published an article explaining the advanced technical toolkit and concepts required for today's data engineers, covering data processing, infrastructure, DevOps, data quality, AI, and soft skills.

I found it to be a comprehensive overview of the tools a data engineer might need, and I believe many will find it useful.

https://motherduck.com/blog/data-engineering-toolkit-infrastructure-devops/

"The State of Data Teams" by Hex

Hex has released "The State of Data Teams," a report summarizing the results of a survey of 2,000 data leaders.

What particularly caught my eye were findings like: while 77% of people have high expectations for AI, only 3% of leaders are making AI a top priority, and 84% of data team leaders prioritize data quality and reliability above all else.

https://hex.tech/state-of-data-teams/

An Article on "Git for Data" in the Lakehouse Era

The Orchestra blog published an article titled "What does git for Data even look like anymore?" which discusses "Git for Data" in the context of the lakehouse era.

It explains the current state of applying Git workflows from software development to the world of data, particularly within lakehouse architectures based on Iceberg and S3.

It also introduces tools like Nessie, LakeFS, Bauplan, and Y42, highlighting how each takes a different approach to achieving version control for the data itself.

https://dataopsleadership.substack.com/p/what-does-git-for-data-even-look

Data Warehouse/Data Lakehouse

Snowflake

How to Access a Semantic View from Excel

An article on Medium explained how to access a Semantic View from Excel.

The ability for end-users to reference metrics defined in code within a Semantic View directly from Excel strikes me as a very interesting use case!

https://medium.com/@angelamarieharney/query-a-snowflake-semantic-model-from-excel-fd79279ac8c9

The latest version of terraform-provider-snowflake, "v2.3.0," has been released

Version 2.3.0 of the terraform-provider-snowflake has been released.

The support for PROGRAMMATIC_ACCESS_TOKEN is a welcome addition.

https://github.com/snowflakedb/terraform-provider-snowflake/releases/tag/v2.3.0

"IMPACT": The App Built with Snowflake & Streamlit to Measure the Commercial Impact of Experiments at Canva

Canva has published an article about "IMPACT," an app they built using Snowflake and Streamlit to measure the commercial impact of their experiments.

Canva runs thousands of experiments annually, but the process of measuring their commercial impact (on MAU, ARR, etc.) was manual, inefficient, and inconsistent across teams.

To solve this, they developed the in-house app "IMPACT" using Snowflake and Streamlit. This enables them to quickly perform self-service estimations of an experiment's impact, reducing analysis time from hours to less than 10 minutes.

https://www.canva.dev/blog/engineering/measuring-commerical-impact-at-scale/

MotherDuck/DuckDB

DuckLake ver 0.2 has been released

The latest version of DuckLake, ver 0.2, is now out.

This is a great update with many improvements, including a Secret feature for managing DuckLake credentials, a Settings feature to configure Parquet file compression and row group sizes at the global, schema, or table level, and a new three-tiered structure (global/schema/table) for managing Parquet file storage paths.

https://duckdb.org/2025/07/04/ducklake-02.html

MotherDuck Launches Managed Service for DuckLake in Preview

MotherDuck has started offering a managed service for DuckLake in preview.

It seems to offer flexible options, such as choosing whether to store data in MotherDuck's managed storage or the user's own cloud storage, and whether to use MotherDuck's resources or a local DuckDB for compute.

https://motherduck.com/blog/announcing-ducklake-support-motherduck-preview/

AI Development Using DuckDB, uv, and Cursor

MotherDuck's blog has published an article on AI development methods using DuckDB, uv, and Cursor.

I was personally impressed by the technique of using Python to get the DuckDB schema, outputting it as XML, and feeding it as context to Cursor.

https://motherduck.com/blog/vibe-coding-sql-cursor/
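
To make the schema-to-XML idea concrete, here is a minimal sketch of what such a script could look like. It is not the code from the MotherDuck article: the function names and the XML layout (`schema`, `table`, `column`) are my own assumptions, and the DuckDB part assumes the `duckdb` Python package is installed and simply reads `information_schema.columns`.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def schema_rows_to_xml(rows):
    """Turn (table_name, column_name, data_type) rows into an XML string.

    The element and attribute names are illustrative, not a standard format.
    """
    tables = defaultdict(list)
    for table_name, column_name, data_type in rows:
        tables[table_name].append((column_name, data_type))

    root = ET.Element("schema")
    for table_name in sorted(tables):
        table_el = ET.SubElement(root, "table", name=table_name)
        # Columns keep the order in which the rows were supplied.
        for column_name, data_type in tables[table_name]:
            ET.SubElement(table_el, "column", name=column_name, type=data_type)
    return ET.tostring(root, encoding="unicode")


def fetch_schema_rows(database_path):
    """Read column metadata from a DuckDB database file.

    Assumes `pip install duckdb`; imported lazily so the XML helper
    above can be used and tested without it.
    """
    import duckdb

    con = duckdb.connect(database_path, read_only=True)
    try:
        return con.execute(
            "SELECT table_name, column_name, data_type "
            "FROM information_schema.columns "
            "ORDER BY table_name, ordinal_position"
        ).fetchall()
    finally:
        con.close()
```

One possible way to use it is to write the result of `schema_rows_to_xml(fetch_schema_rows("my.duckdb"))` to a file in the project (the database path here is hypothetical) and point Cursor at that file as context when asking it to write SQL.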

Data Transform

dbt

"Tokyo dbt Meetup #15" was held on the theme of dbt x LLM

On July 3, 2025, the "Tokyo dbt Meetup #15" was held, focusing on the theme of dbt x LLM. Given the recent trends in LLM technology, this was a hot topic for many! (Unfortunately, I couldn't attend in person, but it seems to have been a great success, with a live coding demo of Claude Code x dbt x Snowflake by Hishinuma-san.)

https://www.meetup.com/tokyo-dbt-meetup/events/307976885/?eventOrigin=group_past_events

A report blog about the event and the presentation slides from Ubie have been published, so I'll share the links here.

https://zenn.dev/shinyaa31/articles/6c3ebfebc4831a

https://speakerdeck.com/yoshyum/dbtmin-zhu-hua-tollmniyorukai-fa-busuto-ai-readynafen-xi-saikuruwomu-zhi-site

dbt Fusion now supports BigQuery

dbt Fusion, currently in Beta, has started supporting BigQuery! This means it now supports Snowflake, Databricks, and BigQuery.

According to the timeline at the link below, support for Redshift is also planned for the end of July.

https://github.com/dbt-labs/dbt-fusion?tab=readme-ov-file#timeline

Trying out an existing dbt Core 1.9 repository with dbt Fusion and the official dbt VS Code extension

Shameless plug for my own article, but I tried running an existing repository that was on dbt Core 1.9 with dbt Fusion and the official dbt VS Code extension and wrote about my experience.

Although I had picked up some knowledge about dbt Fusion from webinars and official blogs beforehand, actually trying it out gave me a developer experience better than I had thought possible with traditional dbt. I was truly impressed!

https://dev.classmethod.jp/articles/dbt-core-1-9-to-dbt-fusion-with-vs-code-extension/

Coalesce

Coalesce announces support for Databricks

Coalesce has announced new support for Databricks. Previously, it only supported Snowflake, but now it's expanding its product compatibility.

https://docs.coalesce.io/updates#publications/databricks-support-now-available

Business Intelligence

Looker

"Looker Continuous Integration": A Native CI Feature Released in Looker 25.10

Shameless plug for my own article again, but I tried out the new "Looker Continuous Integration" feature released in Looker 25.10 in June and wrote about it in a blog post.

It's important to note that enabling this feature currently means your data will be stored in the United States. That said, the CI feature helps you avoid breaking dashboards in your production environment, so you can focus more on development!

https://dev.classmethod.jp/articles/looker-25-10-continuous-integration/

Data Catalog

OpenMetadata

OpenMetadata 1.8 has been released

The latest version of OpenMetadata, "1.8," has been released. The SaaS version has also been updated to 1.8.

https://blog.open-metadata.org/announcing-openmetadata-1-8-948eb14d41c7

https://blog.getcollate.io/announcing-collate-18

The release of the MCP Server, which is integrated into the OpenMetadata server, is a very exciting update! The demo video at the link below shows the steps for setting up OpenMetadata's MCP Server in Claude Desktop.

https://www.youtube.com/watch?v=7ryQhpquL9c

Data Orchestration

Dagster

The latest version of Dagster, "1.11," has been released

Dagster version 1.11 is now available. Updates include the Components feature becoming stable, the dg CLI becoming stable, and project/workspace setup via the create-dagster command.

https://dagster.io/blog/dagster-1-11-build-me-up-buttercup

https://github.com/dagster-io/dagster/releases/tag/1.11.0
