Nicolay Rusnachenko

nicolay-r

AI & ML interests

Information Retrieval・Medical Multimodal NLP (🖼+📝) Research Fellow @BU_Research・software developer http://arekit.io・PhD in NLP

Organizations

None yet

Posts 36

Post
📢 If you want to run quick experiments with an LLM and a known Chain-of-Thought (CoT) / prompt schema, with no-string dependencies, then I have something relevant to share 💎

I have released an updated version, 📦 bulk-chain-0.25.0 📦, which aims to provide an accessible API for instantly applying an LLM to massive data iterators via a predefined prompt schema 🎊

📦: https://pypi.org/project/bulk-chain/0.25.0/
🌟: https://github.com/nicolay-r/bulk-chain
📘: https://github.com/nicolay-r/bulk-chain/issues/26
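
To illustrate what applying a predefined prompt schema to a data iterator means, here is a minimal sketch in plain Python. It is not the bulk-chain API: the `SCHEMA` layout, `apply_schema`, and `echo_llm` are hypothetical names made up for this example, and the "LLM" is a stub that only echoes its prompt.

```python
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical schema: each step formats a prompt from the fields collected
# so far and stores the model answer under a new field name.
SCHEMA: List[Dict[str, str]] = [
    {"prompt": "Summarize the text: {text}", "out": "summary"},
    {"prompt": "Name the sentiment of this summary: {summary}", "out": "sentiment"},
]


def apply_schema(
    records: Iterable[Dict[str, str]],
    schema: List[Dict[str, str]],
    llm: Callable[[str], str],
) -> Iterator[Dict[str, str]]:
    """Lazily run every schema step for every record of the iterator."""
    for record in records:
        row = dict(record)  # copy, so the input record is not mutated
        for step in schema:
            row[step["out"]] = llm(step["prompt"].format(**row))
        yield row


def echo_llm(prompt: str) -> str:
    """Stub standing in for a real model call (assumption for the demo)."""
    return f"<answer to: {prompt}>"


rows = list(apply_schema([{"text": "Great phone, weak battery."}], SCHEMA, echo_llm))
print(rows[0]["sentiment"])
```

Because `apply_schema` is a generator, records are processed one at a time, so the input can be an arbitrarily large iterator; swapping `echo_llm` for any callable that maps a prompt to a string is the only change needed to try a real model.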

The key updates of the most recent release are:
✅ 🪶 No-string (empty dependencies): you can use any framework / API for LLM.
✅ 🐍 Python API support (see first screenshot 📸).
✅ 💥 Native try-catch wrapping to guarantee no data loss, especially when using remote providers: OpenAI, ReplicateIO, OpenRouter, etc.
✅ 🔥 Batching mode support: you may wrap your model to handle batches and significantly boost performance 🚀 (see the screenshot below for enabling batching 📸)
✅ 🔧 Fixed a lot of minor bugs
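
The try-catch wrapping and batching ideas from the list above can be sketched generically as follows. This is only an illustration under my own assumptions, not how bulk-chain implements them: `batched`, `safe_batch_infer`, and `flaky_provider` are hypothetical helpers, and the provider is a local stub that fails on purpose.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Optional, Tuple


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Split any iterable into lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def safe_batch_infer(
    prompts: Iterable[str],
    infer_batch: Callable[[List[str]], List[str]],
    batch_size: int = 2,
) -> Iterator[Tuple[str, Optional[str], Optional[str]]]:
    """Yield (prompt, answer, error); a failed batch is recorded, not lost."""
    for batch in batched(prompts, batch_size):
        try:
            answers: List[Optional[str]] = list(infer_batch(batch))
            error = None
        except Exception as exc:  # e.g. a remote provider timeout
            answers = [None] * len(batch)
            error = repr(exc)
        for prompt, answer in zip(batch, answers):
            yield prompt, answer, error


def flaky_provider(batch: List[str]) -> List[str]:
    """Stub provider (assumption): fails whenever a prompt contains 'boom'."""
    if any("boom" in p for p in batch):
        raise RuntimeError("provider unavailable")
    return [p.upper() for p in batch]


results = list(safe_batch_infer(["a", "b", "boom", "c", "d"], flaky_provider))
for row in results:
    print(row)
```

Every input prompt still appears in the output even when its batch fails, so the failed rows can be retried later instead of silently disappearing.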

Quick start on Google Colab:
📙: https://colab.research.google.com/github/nicolay-r/bulk-chain/blob/master/bulk_chain_tutorial.ipynb

📘 The wiki of the project is available here:
https://github.com/nicolay-r/bulk-chain/wiki/Project-Documentation
Post
📢 So far I have noticed that 🧠 reasoning with LLMs 🤖 in English tends to be more accurate than in other languages.
However, besides googletrans and other openly available translators, I could not find an easy-to-use solution that avoids:
1.🔴 Third-party framework installation
2.🔴 Text chunking
3.🔴 Lack of support for meta-annotations like spans / objects / etc.

💎 To cope with the problem of IR from non-English texts, I am happy to share bulk-translate 0.25.0. 🎊

⭐ https://github.com/nicolay-r/bulk-translate

bulk-translate is a tiny Python 🐍 no-string framework that translates a series of texts with pre-annotated fixed spans that stay invariant for the translator.

It provides a 👨‍💻 Python 🐍 API for quick translation of data with (optionally) annotated objects in texts (see figure below).
I have made it as accessible as possible for RAG and / or LLM-powered downstream apps:
📘 https://github.com/nicolay-r/bulk-translate/wiki

All you have to do is provide an iterator of texts, where each text is either:
1. ✅ A string object
2. ✅ A list of strings and nested lists that represent spans (value + any ID data).
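
The two input shapes above can be handled roughly as in the sketch below, which masks each span with a placeholder, translates the text in one call, and then puts the spans back untouched. This is a guess at the general idea, not the bulk-translate implementation: `translate_keeping_spans`, the `[[n]]` placeholder format, and the toy dictionary translator are all assumptions, and a real translator may alter placeholders.

```python
import re
from typing import Callable, List, Union

Span = list  # assumed shape: [value, id_data, ...]
Text = Union[str, List[Union[str, Span]]]

PLACEHOLDER = re.compile(r"\[\[(\d+)\]\]")


def translate_keeping_spans(text: Text, translate: Callable[[str], str]) -> Text:
    """Translate plain strings; keep nested-list spans exactly as given."""
    if isinstance(text, str):
        return translate(text)

    spans: List[Span] = []
    masked: List[str] = []
    for part in text:
        if isinstance(part, str):
            masked.append(part)
        else:
            masked.append(f"[[{len(spans)}]]")  # mask the span by its index
            spans.append(part)

    translated = translate(" ".join(masked))  # one call, no chunking

    # Rebuild the list: translated chunks with the original spans restored.
    out: List[Union[str, Span]] = []
    pos = 0
    for match in PLACEHOLDER.finditer(translated):
        chunk = translated[pos:match.start()].strip()
        if chunk:
            out.append(chunk)
        out.append(spans[int(match.group(1))])
        pos = match.end()
    tail = translated[pos:].strip()
    if tail:
        out.append(tail)
    return out


TOY_DICT = {"visited": "besuchte", "yesterday": "gestern"}


def toy_translate(sentence: str) -> str:
    """Word-by-word stub translator (assumption for the demo)."""
    return " ".join(TOY_DICT.get(word, word) for word in sentence.split())


annotated = [["Anna", "PERSON"], "visited", ["Paris", "LOC"], "yesterday"]
print(translate_keeping_spans(annotated, toy_translate))
print(translate_keeping_spans("visited yesterday", toy_translate))
```

Masking keeps the whole sentence in a single translation call, so the translator sees full context while the span values and their ID data pass through unchanged.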

🤖 By default I provide a wrapper over googletrans, which you can override with your own 🔥
https://github.com/nicolay-r/bulk-translate/blob/master/models/googletrans_310a.py

datasets

None public yet