Writer

Enterprise

company

Verified

https://writer.com/

Get_Writer

writer

Activity Feed

AI & ML interests

AGI, LLMs, Knowledge Graph, Palmyra, Domain Specific LLM

Recent Activity

melisa authored a paper about 1 month ago

Expect the Unexpected: FailSafe Long Context QA for Finance

melisa updated a model about 2 months ago

Writer/colab

dmytro-writer authored a paper 2 months ago

Comparative Analysis of Retrieval Systems in the Real World

View all activity

Writer's activity

kiranr

updated a model 21 days ago

Writer/Palmyra-local-1_7B

Updated 21 days ago • 11 • 1

wassemgtk

posted an update 23 days ago

Post

2805

I’ve been diving into the iRoPE architecture from Llama 4—a game-changer for long-context models! It interleaves local attention (with RoPE) for short contexts and global attention (with inference-time temp scaling) for long-range reasoning, aiming for infinite context. I’m going to try writing iRoPE—who wants to help?

Code: https://github.com/wassemgtk/iRoPE-try/blob/main/iRoPE.ipynb

1 reply

melisa

updated a model 27 days ago

Writer/colab

Updated 27 days ago • 1

wassemgtk

posted an update about 1 month ago

Post

2083

For fun, a new project: SuperTokenizer! A BPE tokenizer trained on C4 to beat GPT-4. Byte-level, A100-powered, and open-source. Messing around with tokens!
https://github.com/wassemgtk/SuperTokenizer

1 reply

melisa

authored a paper about 1 month ago

Expect the Unexpected: FailSafe Long Context QA for Finance

Paper • 2502.06329 • Published Feb 10 • 131

wassemgtk

posted an update 2 months ago

Post

1880

# GESAL: Real-Time Adaptation for LLMs

We’re excited to unveil **Graph-Enhanced Singular Adaptive Learning (GESAL)**, a framework that lets LLMs like meta-llama/Llama-3.2-1B adapt in real time using user feedback. Check out the code and white paper on GitHub!

🔗 **Code**: [https://github.com/writer/AI-Adaptive-Learning-GESAL](https://github.com/writer/AI-Adaptive-Learning-GESAL)

---

## Why GESAL?

Static LLMs struggle to adapt without heavy retraining. GESAL solves this with:
- **SVF**: Adapts weights via \( W' = U (\Sigma \cdot z) V^T \), using few parameters.
- **Graph Memory**: Stores adaptations in nodes for scalability.
- **RL**: Updates via \( J(z) = \mathbb{E}[\log \pi_z(y|x) r] \) based on feedback.

---

## How It Works

Ask "How many R’s in ‘strawberry’?" If it says "2" and you say "no," GESAL learns to say "3" next time, avoiding repeats.

---

## Try It

Built with Hugging Face’s transformers:

pip install transformers torch numpy
python Adaptive_Learning_(GESAL).py

Needs a Hugging Face token for Llama-3.2-1B.

---

## Results

GESAL hits 95% accuracy after 5 feedbacks vs. LoRA’s 70%. It’s efficient (~0.5M params) and scalable.

15 replies

wassemgtk

authored a paper 2 months ago

Expect the Unexpected: FailSafe Long Context QA for Finance

Paper • 2502.06329 • Published Feb 10 • 131

dmytro-writer

authored 2 papers 2 months ago

Comparative Analysis of Retrieval Systems in the Real World

Paper • 2405.02048 • Published May 3, 2024

Expect the Unexpected: FailSafe Long Context QA for Finance

Paper • 2502.06329 • Published Feb 10 • 131

kiranr

authored a paper 3 months ago

Expect the Unexpected: FailSafe Long Context QA for Finance

Paper • 2502.06329 • Published Feb 10 • 131

muayad

authored a paper 3 months ago

Expect the Unexpected: FailSafe Long Context QA for Finance

Paper • 2502.06329 • Published Feb 10 • 131

samjulien

posted an update 5 months ago

Post

1536

🔥 RAG in just a few lines of code?!

Try out our Hacker News Listener with new built-in RAG capabilities and Palmyra X 004 from the team at Writer!

This Writer Framework app:

- Scrapes up to 500 HN stories and comments
- Uploads them to a Knowledge Graph
- Enables interactive chat with the content using graph-based RAG
- Provides source attribution with every response

The best part? Setting up RAG is now incredibly simple - just a few lines of code to connect your Knowledge Graph as a tool with Palmyra X 004.

🤗 Space: samjulien/hacker-news-listener
💻 Code: https://github.com/writer/framework-tutorials/tree/main/hacker-news-social-listener

melisa

posted an update 8 months ago

Post

3124

🔥 Introducing "Writing in the Margins (WiM)" - better inference pattern for long context LLMs that solves the Lost-in-the-Middle problem 🔥

Paper page: Writing in the Margins: Better Inference Pattern for Long Context Retrieval (2408.14906)

TL;DR
Make your model write "margin notes" as you chunk prefill the KV cache. Then ask it reread all notes before it speaks up.
Works with humans, works with AI 🤖

WiM leverages the chunked prefill of the key-value cache, which concurrently generates query-based extractive summaries at each step of the prefill that are subsequently reintegrated at the end of the computation. We term these intermediate outputs “margins”, drawing inspiration from the practice of making margin notes for improved comprehension of long contexts in human reading. We show that this technique, which adds only minimal additional computation, significantly improves LLMs long context reasoning capabilities.

Think: Every chunk has a chance to be attended to/ be at the end of the context at least once. 🎉

📊 Results:
- An average accuracy boost of 7.5% in multi-hop reasoning tasks like HotpotQA and MultiHop-RAG.
- Even a 30% increase in F1-score for summarisation-like tasks (CWE).

Plus, WiM fits seamlessly into interactive applications (think: progress bar!). It can provide real-time progress updates during data retrieval and integration, making it user-friendly and transparent - a stark contrast to feeding 1mln tokens to an LLMs and waiting 6 min for the first token. 🤯

👩‍💻🧑‍💻 Check it out and contribute to our open-source project here: https://github.com/writer/writing-in-the-margins

🧠 More about chunked prefill: https://docs.vllm.ai/en/latest/models/performance.html#chunked-prefill

2 replies

melisa

authored a paper 8 months ago

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published Aug 27, 2024 • 143

wassemgtk

authored a paper 8 months ago

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published Aug 27, 2024 • 143

kiranr

authored a paper 8 months ago

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published Aug 27, 2024 • 143

melisa

authored 2 papers 8 months ago

OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web

Paper • 2402.17553 • Published Feb 27, 2024 • 26

Becoming self-instruct: introducing early stopping criteria for minimal instruct tuning

Paper • 2307.03692 • Published Jul 5, 2023 • 26

samjulien

posted an update 9 months ago

Post

1971

🔥 Today, Writer dropped Palmyra-Med-70b and Palmyra-Fin-70b, two new domain-specific models that are setting a new standard for medical and financial model performance.

TL;DR
Palmyra-Med-70b
🔢 8k and 32k versions available
🚀 MMLU performance of ~86%, outperforming other top models
👨‍⚕️ Great for diagnosing, planning treatments, medical research, insurance coding and billing
📃 Open-model license for non-commercial use cases
🤗 Available on Hugging Face: Writer/Palmyra-Med-70B
💾 Live on NVIDIA NIM: https://build.nvidia.com/writer/palmyra-med-70b

Palmyra-Fin-70b
🚀 Passed the CFA Level III exam with a 73% score — the first model to do so
💸 Skilled at complex tasks like investment research, financial analysis, and sentiment analysis
📈 Outperformed other top models on a long-fin-eval test of real-world use cases
📃 Open-model license for non-commercial use cases
🤗 Available on Hugging Face: https://huggingface.co/Writer/Palmyra-Fin-70B-32K
💾 Live on NVIDIA NIM: https://build.nvidia.com/writer/palmyra-fin-70b-32k

Try them out and let us know what you think!

2 replies

wassemgtk

authored a paper 9 months ago

Comparative Analysis of Retrieval Systems in the Real World

Paper • 2405.02048 • Published May 3, 2024

AI & ML interests

Recent Activity

Team members 123

Writer's activity