datafuselabs/datafuse: An elastic and scalable Cloud Warehouse, offers Blazing Fast Query and combines Elasticity, Simplicity, Low cost of the Cloud, built to make the Data Cloud easy

[https://github.com/datafuselabs/datafuse/] - 2021-08-16 01:32:14 - public:aguynamedryan

cloud, data, rust - 3 | id:747701 -

apache/arrow-datafusion: Apache Arrow DataFusion and Ballista query engines

[https://github.com/apache/arrow-datafusion] - 2021-08-16 01:31:11 - public:aguynamedryan

data, etl, pipeline, rust - 4 | id:747700 -

q - Text as Data

[https://harelba.github.io/q/] - 2021-06-14 15:33:23 - public:aguynamedryan

cli, csv, data, sql, tools, try - 6 | id:684375 -

data-cleaning/validate: Professional data validation for the R environment

[https://github.com/data-cleaning/validate] - 2021-05-23 03:44:53 - public:aguynamedryan

data, package, R, try - 4 | id:684109 -

ropensci/skimr: A frictionless, pipeable approach to dealing with summary statistics

[https://github.com/ropensci/skimr] - 2021-05-23 03:44:32 - public:aguynamedryan

data, package, R - 3 | id:684108 -

choonghyunryu/dlookr: Tools for Data Diagnosis, Exploration, Transformation

[https://github.com/choonghyunryu/dlookr] - 2021-05-23 03:43:59 - public:aguynamedryan

data, package, R - 3 | id:684107 -

data-cleaning/dcmodify: Modify data records using separately defined modification rules

[https://github.com/data-cleaning/dcmodify] - 2021-05-23 03:43:08 - public:aguynamedryan

data, package, R - 3 | id:684106 -

data-cleaning/deductive: Methods for deductive data correction and imputation

[https://github.com/data-cleaning/deductive] - 2021-05-23 03:42:42 - public:aguynamedryan

data, package, R - 3 | id:684105 -

data-cleaning/errorlocate: Find and replace erroneous fields in data using validation rules

[https://github.com/data-cleaning/errorlocate] - 2021-05-23 03:41:43 - public:aguynamedryan

data, package, R - 3 | id:684104 -

Introducing Amazon S3 Object Lambda – Use Your Code to Process Data as It Is Being Retrieved from S3 | AWS News Blog

[https://aws.amazon.com/blogs/aws/introducing-amazon-s3-object-lambda-use-your-code-to-process-data-as-it-is-being-retrieved-from-s3/] - 2021-04-16 17:07:32 - public:aguynamedryan

aws, data, s3 - 3 | id:683416 -

Using S3 Object Lambdas to Generate and Transform on the fly | by Eoin Shanaghy | Mar, 2021 | Medium

[https://eoins.medium.com/using-s3-object-lambdas-to-generate-and-transform-on-the-fly-874b0f27fb84] - 2021-04-01 02:42:07 - public:aguynamedryan

data, serverless - 2 | id:678582 -

A Data Pipeline Is a Materialized View | Hacker News

[https://news.ycombinator.com/item?id=26217911&utm_term=comment] - 2021-03-01 03:40:03 - public:aguynamedryan

data, pipeline - 2 | id:574069 -

Estuary Flow (Preview) — Estuary Flow (Preview) documentation

[https://estuary.readthedocs.io/en/latest/README.html] - 2021-03-01 03:39:33 - public:aguynamedryan

data, pipeline, python - 3 | id:574068 -

Building Rich Terminal Dashboards | Hacker News

[https://news.ycombinator.com/item?id=26149488&utm_term=comment] - 2021-03-01 03:30:59 - public:aguynamedryan

cli, data - 2 | id:574062 -

Show HN: I wrote a book about using data science to solve “everyday” problems | Hacker News

[https://news.ycombinator.com/item?id=26253281&utm_term=comment] - 2021-03-01 03:28:57 - public:aguynamedryan

data, programming - 2 | id:574061 -

Hierarchical Structures in PostgreSQL

[https://hoverbear.org/blog/postgresql-hierarchical-structures/] - 2021-01-20 22:19:40 - public:aguynamedryan

data, pg - 2 | id:488445 -

Datasette: An open source multi-tool for exploring and publishing data

[https://datasette.io/] - 2020-12-23 19:21:20 - public:aguynamedryan

data, programming, try - 3 | id:485248 -

Using PostgreSQL and SQL to Randomly Sample Data

[https://info.crunchydata.com/blog/randomly-sampling-data-using-sql-and-postgresql] - 2020-10-28 16:22:58 - public:aguynamedryan

data, pg, stats, try - 4 | id:436596 -

Wikidata:SPARQL query service/A gentle introduction to the Wikidata Query Service - Wikidata

[https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/A_gentle_introduction_to_the_Wikidata_Query_Service#A_gentle_introduction_to_the_Wikidata_Query_Service] - 2020-10-26 16:57:54 - public:aguynamedryan

api, data, information model, ontology - 4 | id:426300 -

Simple Anomaly Detection Using Plain SQL | Haki Benita

[https://hakibenita.com/sql-anomaly-detection] - 2020-09-25 17:45:53 - public:aguynamedryan

data, rdbms - 2 | id:388117 -

DuckDB - An embeddable SQL OLAP database management system

[https://duckdb.org/] - 2020-08-21 15:29:03 - public:aguynamedryan

data, try - 2 | id:366387 -

mlin/GenomicSQLite: Genomics Extension for SQLite

[https://github.com/mlin/GenomicSQLite] - 2020-08-21 15:28:40 - public:aguynamedryan

data, sql, try - 3 | id:366386 -

Running Awk in parallel to process 256M records

[https://ketancmaheshwari.github.io/posts/2020/05/24/SMC18-Data-Challenge-4.html] - 2020-06-05 17:22:49 - public:aguynamedryan

awk, data, etl - 3 | id:321757 -

TXR Language

[https://www.nongnu.org/txr/] - 2020-04-19 18:46:37 - public:aguynamedryan

data, etl, try - 3 | id:309584 -

What's new in Kiba ETL v3 (visually explained)

[https://thibautbarrere.com/2020/03/05/new-in-kiba-etl-v3] - 2020-03-12 23:01:49 - public:aguynamedryan

data, etl, kiba, library, ruby - 5 | id:290722 -

thewhitetulip/awk-anti-textbook: learn awk by example

[https://github.com/thewhitetulip/awk-anti-textbook] - 2020-03-09 18:10:46 - public:aguynamedryan

awk, cli, data - 3 | id:285274 -

Preparing your Postgres data for scale-out - DEV Community

[https://dev.to/heroku/preparing-your-postgres-data-for-scale-out-km] - 2020-02-26 16:23:22 - public:aguynamedryan

data, pg, scale, shard - 4 | id:283237 -

In Loving Memory of Strictly-Typed Schemas - ssense-tech - Medium

[https://medium.com/ssense-tech/in-loving-memory-of-strictly-typed-schemas-89ae6e186202] - 2020-02-21 19:38:53 - public:aguynamedryan

data, db, nosql, thinkpiece - 4 | id:283175 -

Command Line Tricks For Data Scientists

[https://kadekillary.work/post/cli-4-ds/] - 2019-10-04 21:42:44 - public:aguynamedryan

cli, data, tools - 3 | id:277799 -

yabs.io (Yet Another Bookmarks Service): aguynamedryan's bookmarks