GitHub succeeded because it created tools that developers really wanted and used. The combination of Git's technical features and GitHub's social features made it very popular.
Analytics and data workflows still lag behind traditional development methods. It's important to find better ways to show the value of data to businesses.
A newer way of thinking about pricing starts from what buyers really want rather than relying only on traditional methods. This can lead to smarter pricing strategies.
Data bugs can be costly for companies, with bad data potentially costing up to 25% of their revenue. These issues often arise from problems in data-centric systems like dbt.
Using dbt allows data engineers to implement software practices like version control and testing, helping to ensure the correctness of their data transformations. However, relying solely on post-processing tests has its limits.
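As a rough illustration of what such post-transformation tests check, here is a minimal Python sketch modeled loosely on the idea behind dbt's built-in `not_null` and `unique` tests. It is not dbt code: the `orders` rows and the helper names `check_not_null` and `check_unique` are hypothetical and exist only for this example.

```python
from collections import Counter


def check_not_null(rows, column):
    """Return the rows where `column` is missing or None."""
    return [r for r in rows if r.get(column) is None]


def check_unique(rows, column):
    """Return the non-null values of `column` that appear more than once."""
    counts = Counter(r[column] for r in rows if r.get(column) is not None)
    return sorted(v for v, n in counts.items() if n > 1)


# Hypothetical output of a transformation (illustrative data only).
orders = [
    {"order_id": 1, "amount": 20.0},
    {"order_id": 2, "amount": None},
    {"order_id": 2, "amount": 35.5},
]

null_failures = check_not_null(orders, "amount")
duplicate_ids = check_unique(orders, "order_id")

print("rows with null amount:", null_failures)
print("duplicated order_id values:", duplicate_ids)
```

Checks like these catch structural problems, but they would still pass if every amount were present, unique, and wrong (for example, computed in the wrong currency), which is one way to read the limits mentioned above.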
Manual spot checks are still crucial in ensuring data accuracy during code reviews. Tools like Recce aim to streamline this process, making it easier for developers to validate and document their changes.
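A manual spot check during review often amounts to comparing a model's output before and after a change. The sketch below shows that idea in plain Python, assuming both versions fit in memory and share a primary key. It does not use Recce's API; `diff_by_key` and the sample rows are hypothetical.

```python
def diff_by_key(before, after, key):
    """Compare two row sets on `key` and report added, removed, and changed keys."""
    before_by_key = {r[key]: r for r in before}
    after_by_key = {r[key]: r for r in after}

    added = sorted(after_by_key.keys() - before_by_key.keys())
    removed = sorted(before_by_key.keys() - after_by_key.keys())
    changed = sorted(
        k
        for k in before_by_key.keys() & after_by_key.keys()
        if before_by_key[k] != after_by_key[k]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical model output on the base branch and on the change under review.
base_rows = [
    {"customer_id": 1, "revenue": 100},
    {"customer_id": 2, "revenue": 250},
    {"customer_id": 3, "revenue": 75},
]
changed_rows = [
    {"customer_id": 1, "revenue": 100},
    {"customer_id": 2, "revenue": 275},
    {"customer_id": 4, "revenue": 60},
]

summary = diff_by_key(base_rows, changed_rows, "customer_id")
print(summary)
```

A summary like this can be pasted into a pull request so reviewers see which rows moved, but deciding whether each difference is intended still takes human judgment.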
Data engineering can be as efficient as software development with AI-assisted tools.
Coding assistants like GitHub Copilot enhance productivity by reducing the need for external references or tools.
Data systems face challenges in achieving a copilot-like experience due to the dynamic nature of correctness and the reliance on upstream data semantics.
dbt Labs is expanding its features to create a more unified data platform. This means users won’t need multiple tools since dbt can handle many basic data needs.
Applying software development practices to data workflows can be tricky. Testing data is different from testing code, and adopting these practices hasn’t been easy for everyone.
Recce is designed to improve the software development workflow for data. It helps users validate changes easily and ensures everyone understands what correctness means in the data context.