apache/arrow-datafusion: Apache Arrow DataFusion and Ballista query engines [https://github.com/apache/arrow-datafusion] - 2021-08-16 01:31:11 - public:aguynamedryan data, etl, pipeline, rust - 4 | id:747700 -
Running Awk in parallel to process 256M records [https://ketancmaheshwari.github.io/posts/2020/05/24/SMC18-Data-Challenge-4.html] - 2020-06-05 17:22:49 - public:aguynamedryan awk, data, etl - 3 | id:321757 -
TXR Language [https://www.nongnu.org/txr/] - 2020-04-19 18:46:37 - public:aguynamedryan data, etl, try - 3 | id:309584 -
What's new in Kiba ETL v3 (visually explained) [https://thibautbarrere.com/2020/03/05/new-in-kiba-etl-v3] - 2020-03-12 23:01:49 - public:aguynamedryan data, etl, kiba, library, ruby - 5 | id:290722 -
thbar/kiba: Data processing & ETL framework for Ruby [https://github.com/thbar/kiba] - 2020-02-20 17:17:41 - public:aguynamedryan etl, ruby - 2 | id:283159 -
The Rise and Fall of the OLAP Cube [https://www.holistics.io/blog/the-rise-and-fall-of-the-olap-cube/] - 2020-01-31 20:48:25 - public:aguynamedryan db, etl, olap, rdbms - 4 | id:279055 -
Starting out with data puddles, then we’ll think about data lakes [https://medium.com/comic-relief/starting-out-with-data-puddles-then-well-think-about-data-lakes-f103111946db] - 2020-01-31 20:47:54 - public:aguynamedryan db, etl - 2 | id:279054 - Doing data right is time-consuming and hard! There you go, the secret is out. But can we make it easier? Surely that is just part of engineering 101 and we should just accept it, right? The issue for…
GNU Recutils [https://labs.tomasino.org/gnu-recutils/] - 2020-01-31 20:46:16 - public:aguynamedryan cli, db, etl, unix - 4 | id:279051 -
Building Serverless Data Pipelines on Amazon Redshift By Writing SQL with Datacoral | Amazon Web Services [https://aws.amazon.com/blogs/apn/building-serverless-data-pipelines-on-amazon-redshift-by-writing-sql-with-datacoral/] - 2019-10-11 19:13:36 - public:aguynamedryan etl, pipeline, serverless - 3 | id:279007 -