Changelog

All notable changes to this project will be documented in this file, where appropriate the GitHub issue reference will be noted along with the change. Breaking changes will be clearly indicated with the icon.

The format is based on Keep a Changelog.

[0.16.4] - UNRELEASED

Added

[#1813] ASCII, CHAR, LPAD and RPAD functions @joocer
[#998] Initial HTTP function relation @joocer
[#1838] JSONB_OBJECT_KEYS function @joocer

Fixed

[#1837] Unhelpful error on CROSS JOIN UNNEST. @joocer

[0.16.0] - 2024-07-20

Added

[#1802] Table-level permissions @joocer
[#1803] Push date filters to Parquet reader @joocer
[#1784] Support for Tarchia catalog @joocer

Fixed

[#1806] Linux build failures. @joocer

[0.15.9] - 2024-06-29

Added

[#1778] Specialized INNER JOIN operator for INTEGERS and VARCHARS @joocer
[#1768] Date calculation predicate rewriter (optimizer) @joocer

Fixed

[#1788] Errors reading files reported as SQL syntax errors. @joocer

[0.15.8] - 2024-06-29

Added

[#1763] Support @? JSON Accessor @joocer
[#1670] Inital Tarchia catalog support @joocer

Fixed

[#1700] Mac versions not building and publishing to PyPI @joocer

[0.15.5] - 2024-06-18

Changed

[#1759] -> and ->> operators able to parse JSON strings and bytes @joocer

[0.15.4] - 2024-06-17

Added

[#1746] Support legacy Mabel LZMA compressed JSONL files @joocer
[#1748] S3/MinIO Connector supports async reads @joocer

Changed

[#1714] [ClickBench] GROUP BY literals @joocer
[#1696] Prevent predicates being pushed past limits @joocer
[#1753] Error on SELECT TOP syntax @joocer

Fixed

[#1756] Sort incorrectly returning zero records when first morsel is empty @joocer

[0.15.3] - 2024-06-06

Changed

[#1715] More optimizations AND, OR and XOR aware @joocer
[#1717] Memcahed notified when resource read from local buffers @joocer

[0.15.1] - 2024-05-31

Added

[#1685] [ClickBench] Include BLOB as column types for predicate pushdowns @joocer
[#1697] [ClickBench] Optimizer removes redundant operators @joocer
[#1698] [ClickBench] Replace LIKE conditions with STARTS_WITH and ENDS_WITH functions @joocer
[#1581] ANY operator supports literal lists @joocer
[#1690] Support REGEXP_REPLACE function @joocer

[0.15.0] - 2024-05-26

Added

[#723] Initial MATCH AGAINST support. @joocer
[#346] Permissions Model. @joocer
[#1582] Lemmatize full text searches @joocer
[#1584] Additional Statistics @joocer
[#1590] Push filters into sub queries @joocer
[#1588] Push filters into UNNEST @joocer
[#1613] Parallelize reads of GCS storage @joocer
[#1652] BLOB and VARCHAR values can be compared @joocer
[#1666] Heap Sort fused operator (LIMIT and ORDER BY implemented) @joocer
[#1665] Smart buffer size allocations @joocer
[#1676] b-prefixed blob literals @joocer
[#1586] Initial VIEWs functionality @joocer

Changed

[#1550] Additional Statistics @joocer
[#731] Buffer Pool Statistics @joocer
[#1604] Internal 'Node' object performance @joocer
[#1643] Mabel Planning Improvements @joocer

Fixed

[#1587] Filtering on CROSS JOIN UNNEST columns pushed too far. @joocer
[#1592] Prevent RANDOM being evaluated once in optimizer. @joocer
[#1598] Buffer Pool inefficiencies. @joocer
[#1620] Failure to set in Memcache shouldn't be fatal. @joocer
[#1622] Incorrect handling of nulls in JOIN conditions. @joocer
[#1664] Memory Pool stats not correct @joocer
[#1674] Errors with large chunks @joocer

[0.14.1] - 2024-04-13

Added

[#1571] Initial COSINE_SIMILARITY support (ad hoc). @joocer

Fixed

[#1573] Parquet COUNT(*) returns unfiltered row count. @joocer

[0.14.0] - 2024-04-07

Added

[#1460] Support PyArrow's IPC formated files. @joocer
[#1464] LIKE with no wildcards rewritten as = @joocer
[#1468] Python 3.12 support @joocer
[#1459] Primitive support for EXECUTE queries @joocer
[#1479] Unary (IS TRUE) operators pushed to SQL sources @joocer
[#540] Initial support for JSON Accessors -> and ->>. @joocer
[#1499] Error when EXECUTE USING syntax is attempted. @joocer
[#1491] New test dataset $missions. @joocer
[#1300] Support named Parameters. @joocer
[#1516] Day of Week placeholders for temporal queries. @joocer
[#1563] Add support for Cassandra-based sources@joocer

Changed

[#1448] Improved error message on GROUP BY errors. @joocer
[#1447] Levenshtein implementation rewritten. @joocer
[#1451] Improved IP containment testing performance. @joocer
[#1410] + [#190] Rewrite INNER JOIN @joocer
[#1481] Improvements to DATE +/- INTERVAL performance. @joocer
[#1486] Implement bespoke INTERVAL operators @joocer
[#1535] Native implementation of LEFT JOIN @joocer
[#1547] Buffer Pool implemented using a Memory Pool @joocer

Fixed

[#1462] Unhandled exception on empty datasets. @joocer
[#1465] Clashing column names not aliased correctly. @joocer
[#1445] SHOW EXTENDED COLUMNS not working as expected - changes made to data profile format. @joocer
[#1473] INNER JOIN statistics not correct. @joocer
[#1474] Unhelpful error message when SELECT * is mixed with column references @joocer
[#1487] Filters are not applied to scans on specific conditions involving JOINs @joocer
[#1480] INTERVAL cannot be compared to durations @joocer
[#1485] Improved handling of COUNT(*) when pushing to parquet and SQL @joocer
[#1513] LIST_CONTAINS_ANY performance @joocer

[0.13.3] - 2024-02-13

Added

[#1427] Boolean simplification optimization strategy. @joocer

Fixed

[#1413] Regression test failures on Windows. @joocer
[#1320] Errors due to binding memoization. @joocer
[#1433] Bandit tests misconfigured. @joocer

Changes

[#1421] Improve CROSS JOIN UNNEST performance. @joocer
[#1410] Whilst working on JOIN improvements, improvements to DISTINCT made. @joocer
Refactored SQL Fuzzer regression test script. @joocer

[0.13.0] - 2024-02-03

Added

[#1404] Optimize single element literal IN conditions @joocer

Fixed

[#1402] Force consistent blob read order. @joocer
[#1391] Rolling Log doesn't truncate records. @joocer
[#1395] Incorrect error raised on type errors on JOINs. @joocer
[#1397] Subscript SPLIT results. @joocer
[#1379] Unable to run on ARM. @joocer
[#1183] JOIN on literals fails. @joocer

Changed

[#1374] GCS access improvements (up to 2.5x faster IO). @joocer
[#1393] Performance Tuning blob reading. @joocer
[#1411] Updated sqlparser-rs to version 0.43.1 dependabot

[0.12.2] - 2024-01-10

Added

[#1355] Shortcut OR evaluations @joocer
[#1363] Shortcut nested AND evaluations @joocer
[#21] Support UNION statements @joocer
[#1354] New Optimization: Constant Expression Evaluations @joocer

Fixed

[#1370] Compare DATE and TIMESTAMP. @joocer

Changed

[#1366] Subscript refactor for performance (~3x faster). @joocer

[0.12.0] - 2024-01-02

Fixed

[#1080] Windows regression test failures. @joocer
Soundex incorrectly evaluated empty strings as '0000'. @joocer

Changed

[#1083] Simplify the handling of Query Statistics. @joocer
[#1086] Update Pythonize to v0.19 and py03 to v0.20.0 dependabot
[#1042] Create a generic base connector and use for all access @joocer
[#1032] Introduce a query binder @joocer
[#1158] Resync sqloxide @joocer

Added

[#1117] Cockroach Labs regression tests @joocer
[#1128] Specialized handlers for IS NOT TRUE and IS NOT FALSE @joocer
DATES SINCE temporal filter syntax added. @joocer
[#1145] Debug Logging @joocer
[#1156] Bitwise operators and Hex literals @joocer
[#1141] BiQuery regression tests and documentation @joocer
[#1171] Support NATURAL JOIN syntax @joocer
[#1171] Support SEMI and ANTI join syntax @joocer
[#1219] Extended FAKE syntax @joocer
[#1219] DISTINCT ON syntax added @joocer
[#1339] Updated sqlparser-rs to version 0.41.0 dependabot
[#1329] Add Redis as remote read cache option. @joocer
[#1337] Support RLIKE @joocer
[#1344] Initial Support for ANY and ALL array containment syntax @joocer

Removed

Python 3.8 is no longer supported. @joocer

[0.11.0] - 2023-06-16

Fixed

[#1069] Minor improvements identified during code review of code to generate numeric series. @joocer
[#1072] Minor improvements identified during code review of code to handle dates and intervals. @joocer
[#1026] Removed pin to version 0.11 of PyArrow dependabot
[#1077] Removed pin to version 0.7.1 of DuckDB dependabot

Changed

[#808] Rewrite of AST to Logical plan. @joocer
[#1031] .to_df deprecation complete. @joocer
[#356] Prepositioning changes for extended types. @joocer
[#1046] Updated sqlparser-rs to version 0.34.0 dependabot
[#1017] Fuzzy matching for suggestions is punctuation insensitive @joocer
[#1060] Conditional test execution made more explicit. @joocer
[#1026] Timeout FireStore connection. @joocer

Added

[#1034] Schemas added for the internal sample datasets. @joocer
[#1038] Able to pass SqlAlchemy Engine to the SQL Connectors, allowing for more complex authentication scenarios. @joocer
[#1065] Support integer division operator DIV. @joocer

[0.10.0] - 2023-05-03

Warnings

.to_df() will be replaced with .pandas() in version 0.11.

Fixed

[#929] Improved error messages for malformed temporal clauses. @joocer
[#735] (correction) Cursor fetchone and fetchmany step over the record set. @joocer
[#994] LIMIT didn't prevent additional files from being read after limit was met. @joocer
[#996] Performance issues with LIMIT and serialization steps. @joocer
[#1008] JOIN on a literal fails when attempting to find good match. @joocer
[#1006] Errors handling filenames with multiple dots in the name. @joocer
[#1010] Predicates not pushed for ZSTD compressed files. @joocer
[#1007] Wildcards not interpretted correctly in some projection pushdowns @joocer
[#1015] Column comparisons not working as expected in predicate pushdowns @joocer

Changed

[#925] Updated sqlparser-rs to version 0.31.0 dependabot
[#931] Cursor fetch no longer accept asdict parameter, instead each tuple (an orso Row) has an as_dict method @joocer
[#906] Cursor extends an orso DataFrame, providing additional functionality @joocer
[#938] Updated sqlparser-rs to version 0.32.0 dependabot
[#940] CityHash moved to orso.cityhash @joocer
[#942] Profiler (and distogram) moved to orso @joocer
[#965] (MySQL Compatibility) SHOW STORES renamed to SHOW DATABASES @joocer
[#973] Improved readability of Sort nodes in EXPLAIN queries @joocer
[#984] Updated sqlparser-rs to version 0.33.0 dependabot
[#999] Improved error messages when using subscript functions @joocer
[#1019] CircularLog renamed RollingLog @joocer

Added

[#952] Implement statement-based permissions model @joocer
[#951] Initial Support for Prepared Statements (EXECUTE queries) @joocer
[#958] Log of recent queries @joocer
[#905] CLI includes REPL mode @joocer
[#942] CLI has option to output to Markdown format @joocer
[#967] Initial Information Schema capability @joocer
[#978] Initial support for USE queries @joocer
[#989] REPL supports limited dot commands .help and .exit @joocer
[#991] Added SPLIT function @joocer
[#969] New functions supporting Power BI integration @joocer
[#999] STRUCT casting functions @joocer
[#1003] DuckDB compatibility tests @joocer
[#1002] Limit function ignored @joocer

[0.9.3] - 2023-03-04

Fixed

[#916] Profile error on morsel with all nulls in column @joocer
Correctness of LRU-K algorithm @joocer
[#917] Comparisons failed on very long and skinny tables @joocer

[0.9.2] - 2023-02-28

Fixed

[#909] Divide by Zero error handling empty pages @joocer
[#912] Literal expressioned which evaluate to a boolean were ignored @joocer

Changed

[#901] Generate Series no longer accepts single numbers or IP ranges, provide explicit start or use | to test IP address containment @joocer
[#848] Collection and SQL Connectors dynamically size reads to fill target morsel size @joocer

[0.9.1] - 2023-02-23

Fixed

[#888] YAML parser errors. @joocer

[0.9.0] - 2023-02-19

Fixed

[#797] Name collisons with aliases cause issues in ORDER BY. @joocer
[#833] Unhelpful error when no statement is provided @joocer
[#870] Repeated columns in GROUP BY not processed @joocer
[#873] 2 x CodeQL security issues @joocer

Changed

[#799] Chunk large blob reads. @joocer
[#812] Abstract the tree structure that plans are built from. @joocer
[#808] Split Logical and Physical planning (partial). @joocer
[#825] Remove HyperLogLog from profiling. @joocer
[#750] More CLI improvements. @joocer
[#589] Moved conditional imports out of program initialization @joocer
[#836] Use PyArrow 11s exposure of underlying date values in profiler @joocer
[#853] CaskDB replaces RocksDB as default KV store @joocer
[#855] Caches have been renamed and separated from KV Stores to disencourage incorrect use; The Memcache Cache is now imported using from opteryx.managers.cache import MemcachedCache @joocer
[#857] Removed PyYAML install @joocer
[#865] Replaced third-party DATE_TRUNC implementation with a first-party implementation @joocer
[#861] Replaced third-party bitarray library with a first-party implementation @joocer
[#871] Consistently name internal variables relating to chunks of data to 'morsel' (technically breaking, but no user impact expected) @joocer
[#880] Minor performance improvements @joocer

Added

[#801] New helper function opteryx.query(). @joocer
[#818] Save query plans to disk (partial). @joocer
[#163] Initial support for SQL databases as a data source. @joocer
[#844] Materialize results as a Polars dataframe. @joocer
[#869] Introduce a SQL Fuzzer. @joocer
[#877] Initial experimental implementation of internal KV database, HadroDB @joocer

[0.8.3] - 2023-01-10

Fixed

[#782] Support literal predicates in JOIN conditions. @joocer

Changed

[#789] Updated sqlparser-rs to version 0.30.0 dependabot

Added

[#521] Query files directly. @joocer
[#786] Save dataset as pandas DataFrame. @joocer
[#787] Run queries against pandas DataFrames. @joocer

[0.8.2] - 2023-01-06

Fixed

[#757] Multiple bugs in config manager. @joocer
[#769] ARRAY_AGG couldn't be nested. @joocer
[#775] Connection function .arrow() materializes before applying limit. @joocer

Changed

Internal refactoring relating to creation of metadata service. @joocer
[#761] Updated sqlparser-rs to version 0.29.0 dependabot

Added

[#750] CLI improvements. @joocer
[#758] Support AVRO and TSV formatted files. @joocer

[0.8.1] - 2022-12-30

Fixed

[#754] Occassional segfaults on Aggregates. @joocer

[0.8.0] - 2022-12-27

Fixed

[#703] ORDER BY columns not in SELECT clause. @joocer
[#712] Aggregates on literals when combined with a GROUP BY clause. @joocer
[#710] SEARCH mishandles pages with empty values in first row. @joocer
[#711] DATE_TRUNC is case sensitive. @joocer

Changed

[#707] First try to estimate unique values using the Distogram for SHOW EXTENDED COLUMNS. @joocer
[#707] SHOW EXTENDED COLUMNS creates histograms of 20 bins. @joocer
[#707] Distogram (data profiler) significant performance improvements. @joocer
[#722] Allow temporal FOR after alias AS clauses. @joocer
[#743] 'Did you mean' prompt for columns better suggestions when casing is different. @joocer

Added

[#515] Implement various new functions. @joocer
[#19] Initial support for CTE expressions. @joocer
[#204] Initial support predicate pushdowns. @joocer
[#721] Improved temporal range error messages. @joocer

[0.7.0] - 2022-12-02

Fixed

[#653] LIKE and FOR clauses cannot coexist in SHOW queries. @joocer
[#669] COUNT(*) cannot be mixed with other aggregates. @joocer
[#518] SELECT * and GROUP BY can't be used together. @joocer
[#689] IS comparisons cannot be combined with other comparisons when optimization is off. @joocer

Changed

[#662] Updated sqlparser-rs to version 0.27.0 dependabot

Added

[#629] Optimizer pre-evaluates constant expressions. @joocer
[#439] Support SHOW STORES. @joocer
[#542] Support POSITION. @joocer
[#22] Support CASE statements. @joocer
[#665] Partial support of ARRAY_AGG function. @joocer
[#668] Optimizer exchanges functions with constant results. @joocer
[#300] Support advanced TRIM syntax. @joocer
[#570] Optimizer implements De Morgan's Law. @joocer

[0.6.0] - 2022-11-08

Fixed

[#568] Unable to perform aggregates on literals. @joocer
[#592] Dates not always handled correctly. @joocer
[#600] Parameterization when used on query batches fails. @joocer
[#580] Empty result sets have no column information. @joocer
[#548] 'did you mean' message restored for dataset WITH hints. @joocer
[#640] COUNT(*) shortcut only used when in uppercase. @joocer
[#645] (correction) null values not handled correctly in comparisions. @joocer
Problem installing on M1 Mac. @joocer
Support AND, OR, and XOR in SELECT statement. @joocer
[#646] Temporal clauses in incorrect place were ignored @joocer

Changed

[#566] Change from using SQLite3 to DuckDB for SQL comparision tests in Wrenchy-Bench. @joocer
[#584] (clarity) enable_page_management configuration and parameter renamed enable_page_defragmentation with some minor refactoring of approach to defragmentation. @joocer
(alignment) TIMESTAMP casting no longer supports casting from a number. @joocer
[#588] Integrate sqloxide into Opteryx to reduce lag with sqlparser-rs updates. @joocer
[#619] Page defragmentation moved to an Operator and positioned by the Optimizer. @joocer
(correction) cursor 'fetch*' methods return Python tuple, rather than Python lists. @joocer

Added

[#533] Support LIKE on SHOW FUNCTIONS, see sqlparser-rs/#620. @joocer
[#570] Query Optimizer rule to reduce steps in expression evaluation by partial elimination of negatives. @joocer
[#129] Support FOR clauses for all datasets. @joocer
[#543] Support 'type string' notation for casting values. @joocer
[#596] Optimizer replaces ORDER BY and LIMIT plan steps with a single 'HeapSort' plan step. @joocer
[#515] NULLIF function. @joocer
[#581] New SQL Battery test that tests results, and initial set of tests. @joocer
[#577] Hierarchical buffer pool and configuration. @joocer

[0.5.0] - 2022-10-02

Fixed

[#528] .shape() and .count() not working as expected. @joocer
Numbers expressed in the form +n not parsed correctly. @joocer

Changed

(alignment) .as_arrow() renamed to .arrow() to align to DuckDB naming. @joocer
(consistency) SHOW COLUMNS returns the column name in the name column, previously column_name @joocer
(correction) cursor 'fetch*' methods returns tuples rather than dictionaries as defaults, this is correcting a bug in PEP249 compatibility. @joocer
[#517] (security) Placeholder changed from '%s' to '?'. @joocer
[#522] Implementation of LRU-K(2) for cache evictions. @joocer
[#537] Significant refactor of Query Planner. @joocer

Added

[#397] Time Travel with '$planets' dataset. @joocer
[#519] Introduce a size limit on .as_arrow(). @joocer
[#324] Support IN UNNEST(). @joocer
[#386] Support SET statements. @joocer
[#531] Support SHOW VARIABLES and SHOW PARAMETERS. @joocer
[#464] Support LEFT JOIN <relation> USING @joocer
[#402] INNER JOIN ON supports multiple conditions @joocer
[#551] Document stores (MongoDb + FireStore) return '_id' column holding string version of document ID. @joocer
[#532] Runtime parameters are able to be altered using the SET statement. @joocer
[#524] Query Optimizer - conjunctive predicate splitter. @joocer

[0.4.1] - 2022-09-12

Fixed

Fixed missing __init__ file. @joocer

[0.4.0] - 2022-09-12

Added

[#366] Implement 'function not found' suggestions. @joocer
[#443] Introduce a CLI. @joocer
[#351] Support SHOW FUNCTIONS. @joocer
[#442] Various functions. @joocer
[#483] Support SHOW CREATE TABLE. @joocer
[#375] Results to an Arrow Table. @joocer
[#486] Support functions on aggregates and aggregates on functions. @joocer
Initial support for INTERVALs. @joocer
[#395] Support reading CSV files. @joocer
[#498] CLI support writing CSV/JSONL/Parquet. @joocer

Changed

[#457] (correction) null values are removed before performing INNER JOIN USING. @joocer

Fixed

[#448] VERSION() failed and missing from regression suite. @joocer
[#404] COALESCE fails for NaN values. @joocer
[#453] PyArrow bug with long lists creating new columns. @joocer
[#444] Very low cardinality INNER JOINS exceed memory allocation. @joocer
[#459] Functions lose some detail on non-first page. @joocer
[#465] Pages aren't matched to schema for simple queries. @joocer
[#468] Parquet reader shows some fields as "item". @joocer
[#471] Column aliases not correctly applied when the relation has an alias. @joocer
[#489] Intermittent behaviour on hash JOIN algorithm. @joocer

[0.3.0] - 2022-08-28

Added

[#196] Partial implementation of projection pushdown (Parquet Only). @joocer
[#41] Enable the results of functions to be used as parameters for other functions. @joocer
[#42] Enable inline operations. @joocer
[#330] Support SIMILAR TO alias for RegEx match. @joocer
[#331] Support SAFE_CAST alias for TRY_CAST. @joocer
[#419] Various simple functions (SIGN, SQRT, TITLE, REVERSE). @joocer
[#364] Support SOUNDEX function. @joocer
[#401] Support SHA-based hash algorithm functions. @joocer

Changed

(alignment) Paths to storage adapters has been updated to reflect 'connector' terminology.
(sensible defaults) Default behaviour changed from Mabel partitioning to no partitioning.
(correction) - Use of aliases defined in the SELECT clause can no longer be used in WHERE and GROUP BY clauses - this is a correction to align to standard SQL behaviour.
(correction) - Use of 'None' as an alias for null is no longer supported - this is a correction to align to standard SQL behaviour.
[#326] Prefer pyarrow's 'promote' over manually handling missing fields. @joocer
[#39] Rewrite Aggregation Node to use Pyarrow group_by(). @joocer
[#338] Remove Evaluation Node. @joocer
[#58] Performance of ORDER BY RAND() improved. @joocer

Fixed

[#334] All lists should be cast to lists of strings. (@joocer)
[#382] INNER JOIN on UNNEST relation. (@joocer)
[#320] Can't execute functions on results of GROUP BY. (@joocer)
[#399] Strings in double quotes aren't parsed. (@joocer)

[0.2.0] - 2022-07-31

Added

[#232] Support DATEPART and EXTRACT date functions. @joocer
[#63] Estimate row counts when reading blobs. (@joocer)
[#231] Implement DATEDIFF function. (@joocer)
[#301] Optimizations for IS conditions. (@joocer)
[#229] Support TIME_BUCKET function. (@joocer)

Changed

[#35] Table scan planning done during query planning. @joocer
[#173] Data not found raises different errors under different scenarios. (@joocer)
Implementation of LEFT and RIGHT functions to reduce execution time. (@joocer)
[#258] Code release approach. (@joocer)
[#295] Removed redundant projection when SELECT *. (@joocer)
[#297] Filters on SHOW COLUMNS execute before profiling. (@joocer)

Fixed

[#252] Planner should gracefully convert byte strings to ascii strings. (@joocer)
[#184] Schema changes cause unexpected and unhelpful failures. (@joocer)
[#261] Read fails if buffer cache is unavailable. (@joocer)
[#277] Cache errors should be transparent. (@joocer)
[#285] DISTINCT on nulls throws error. (@joocer)
[#281] SELECT on empty aggregates reports missing columns. (@joocer)
[#312] Invalid dates in FOR clauses treated as TODAY. (@joocer)

[0.1.0] - 2022-07-02

Added

[#165] Support S3/MinIO data stores for blobs. (@joocer)
FAKE dataset constructor (part of #179). (@joocer)
[#177] Support SHOW FULL COLUMNS to read entire datasets rather than just the first blob. (@joocer)
[#194] Functions that are abbreviations, should have the full name as an alias. (@joocer)
[#201] generate_series() supports CIDR expansion. (@joocer)
[#175] Support WITH (NO_CACHE) hint to disable using cache. (@joocer)
[#203] When reporting that a column doesn't exist, it should suggest likely correct columns. (@joocer)
'Not' Regular Expression match operator, !~ added to supported set of operators. (@joocer)
[#226] Implement DATE_TRUNC function. (@joocer)
[#230] Allow addressing fields as numbers. (@joocer)
[#234] Implement SEARCH function. (@joocer)
[#237] Implement COALESCE function. (@joocer)

Changed

Blob-based readers (disk & GCS) moved from 'local' and 'network' paths to a new 'blob' path. (@joocer)
Query Execution rewritten. (@joocer)
[#20] Split query planner and query plan into different modules. (@joocer)
[#164] Split dataset reader into specific types. (@joocer)
Expression evaluation short-cuts execution when executing evaluations against an array of null. (@joocer)
[#244] Improve performance of IN test against literal lists. (@joocer)

Fixed

[#172] LIKE on non string column gives confusing error (@joocer)
[#179] Aggregate Node creates new metadata for each chunk (@joocer)
[#183] NOT doesn't display in plan correctly (@joocer)
[#182] Unable to evaluate valid filters (@joocer)
[#178] SHOW COLUMNS returns type OTHER when it can probably work out the type (@joocer)
[#128] JOIN fails, using PyArrow .join() (@joocer)
[#189] Explicit JOIN algorithm exceeds memory (@joocer)
[#199] SHOW EXTENDED COLUMNS blows memory allocations on large tables (@joocer)
[#169] Selection nodes in EXPLAIN have nested parentheses. (@joocer)
[#220] LIKE clause fails for columns that contain nulls. (@joocer)
[#222] Column of NULL detects as VARCHAR. (@joocer)
[#225] UNNEST does not assign a type to the column when all of the values are NULL. (@joocer)

[0.0.2] - 2022-06-03

Added

[#72] Configuration is now read from opteryx.yaml rather than the environment. (@joocer)
[#139] Gather statistics on planning reading of segements. (@joocer)
[#151] Implement SELECT table.*. (@joocer)
[#137] GENERATE_SERIES function. (@joocer)

Fixed

[#106] ORDER BY on qualified fields fails (@joocer)
[#103] ORDER BY after JOIN errors (@joocer)
[#110] SubQueries AS statement ignored (@joocer)
[#112] SHOW COLUMNS doesn't work for non sample datasets (@joocer)
[#113] Sample data has "NaN" as a string, rather than the value NaN (@joocer)
[#111] CROSS JOIN UNNEST should return a NONE when the list is empty (or NONE) (@joocer)
[#119] 'NoneType' object is not iterable error on UNNEST (@joocer)
[#127] Reading from segments appears to only read the first segment (@joocer)
[#132] Multiprocessing regressed Caching functionality (@joocer)
[#140] Appears to have read both frames rather than the latest frame (@joocer)
[#144] Multiple JOINS in one query aren't recognized (@joocer)

[0.0.1] - 2022-05-09

Added

Additional statistics recording the time taken to scan partitions (@joocer)
Support for FULL JOIN and RIGHT JOIN (@joocer)

Changed

Use PyArrow implementation for INNER JOIN and LEFT JOIN (@joocer)

Fixed

[#99] Grouping by a list gives an unhelpful error message (@joocer)
[#100] Projection ignores field qualifications (@joocer)

[0.0.0]

Initial Version