The Donald Trump: New top story on Hacker News: Show HN: Musoq – Query Anything with SQL Syntax (Git, C#, CSV, Can DBC)

Wednesday, 18 December 2024

New top story on Hacker News: Show HN: Musoq – Query Anything with SQL Syntax (Git, C#, CSV, Can DBC)

Show HN: Musoq – Query Anything with SQL Syntax (Git, C#, CSV, Can DBC)
5 by Puchaczov | 0 comments on Hacker News.
Hey, For those of you who don't know my little tool Musoq, I wanted to introduce it as a small tool that allows you to query with SQL-like syntax without any database. It allows you to query various things from niche ones like CAN DBC files, weird ones like C# code, interesting ones with Git querying to regular stuff like CSV, TSV and various others. I am quite a bit experimenting with various things so I'm hybridizing the engine with LLMs or doing other weird stuff that are more or less practical :-) I wanted also to share some recent developments in this little project as I hope it might be interesting to some of you. New Experimental Plugins: * Git Plugin (Beta) : I've been working on Git repository querying - managed to test it on the EF Core repo (16k commits) and it seems to work okay * Roslyn Plugin (Beta) : Added basic C# code analysis capabilities For the very first time: I've extended CROSS APPLY to use computed results as arguments! Now the operator can use values from the current row as inputs. Here's an example: SELECT f.DirectoryName, f.FileName FROM #os.directories('/some/path', false) d CROSS APPLY #os.files(d.FullName, true) f WHERE d.Name IN ('Folder1', 'Folder2') After another pack of fixes I'm finally able to query multiple git repositories AT ONCE! with ProjectsToAnalyze as ( select dir2.FullName as FullName from #os.directories('D:\repos', false) dir1 cross apply #os.directories(dir1.FullName, false) dir2 where dir2.Name = '.git' ) select c.Message, c.Author, c.CommittedWhen from ProjectsToAnalyze p cross apply #git.repository(p.FullName) r cross apply r.Commits c where c.AuthorEmail = 'my-email@email.ok' order by c.CommittedWhen desc Under the Hood: - Added a Buckets feature for memory management (currently just testing it with the Roslyn plugin) - Moved to .NET 8 - Added CROSS/OUTER APPLY operators - Made some improvements to error messages and runtime behavior New piping features: I've been experimenting with piping capabilities: * Image Analysis with LLMs : ./Musoq.exe image encode "image.jpg" | ./Musoq.exe run query "select s.Shop, s.ProductName, s.Price from ..." * Text Data Extraction : Get-Content "ticket.txt" | ./Musoq.exe run query "select t.TicketNumber, t.CustomerName ... from #stdin.text('Ollama', 'llama3.1') t" * Data Source Combination : { docker image ls; ./Musoq.exe separator; docker container ls } | ./Musoq.exe run query "..." I'm working on comprehensive documentation: I encourage you especially to look at section "Practical Examples and Applications" and "Data Sources" where you can look at all the tables the tool currently provides. < https://puchaczov.github.io/Musoq/ > Other Changes: - Made some improvements to OS and Archive data sources (OS can now query metadata like EXIF) - Added a few fields to CAN DBC plugin - Command outputs can now be used as inputs for queries I'm hoping to: - Improve stability and add more tests - Flesh out the documentation - Work on package distribution (Scoop, Ubuntu packages) - Share some examples of source code querying with Roslyn Ideas for later: - WHERE robust analysis and optimizations - DISTINCT operator implementation - PROTOBUF schema support - Performance improvements - Query parallelization - Recursive CTEs - Subqueries I'd really appreciate any thoughts or feedback! The documentation section where I write a short analysis of EF Core with git plugin: < https://puchaczov.github.io/Musoq/practical-examples-and-app... >

No comments:

Post a Comment