Why LLMs Struggle with Arithmetic Puzzles | Post date: August 23, 2025 | Post author: Extrapolate | Post categories: data-synthesis-pipeline, fine-tuning-on-synthetic-data, mathematical-reasoning-ai, out-of-domain-benchmarking, reasoning-verification, symbolic-reasoning-ai, synthetic-data-generation, zero-shot-learning
Testing Large Language Models on Math Puzzles | Post date: August 23, 2025 | Post author: Extrapolate | Post categories: data-synthesis-pipeline, fine-tuning-on-synthetic-data, mathematical-reasoning-ai, out-of-domain-benchmarking, reasoning-verification, symbolic-reasoning-ai, synthetic-data-generation, zero-shot-learning
Evaluating Fine-Tuned LLMs on Reasoning Puzzles | Post date: August 23, 2025 | Post author: Extrapolate | Post categories: data-synthesis-pipeline, fine-tuning-on-synthetic-data, mathematical-reasoning-ai, out-of-domain-benchmarking, reasoning-verification, symbolic-reasoning-ai, synthetic-data-generation, zero-shot-learning
A Framework for Synthesizing Arithmetical Puzzle Datasets for Large Language Models | Post date: August 23, 2025 | Post author: Extrapolate | Post categories: data-synthesis-pipeline, fine-tuning-on-synthetic-data, mathematical-reasoning-ai, out-of-domain-benchmarking, reasoning-verification, symbolic-reasoning-ai, synthetic-data-generation, zero-shot-learning
How LLMs Learn to Solve Complex Math | Post date: August 23, 2025 | Post author: Extrapolate | Post categories: data-synthesis-pipeline, fine-tuning-on-synthetic-data, hackernoon-top-story, mathematical-reasoning-ai, out-of-domain-benchmarking, reasoning-verification, synthetic-data-generation, zero-shot-learning
Finding XPath Bugs in XML Document Processors: Testing XPath Functionality and Other Related Work | Post date: March 12, 2025 | Post author: XPath | Post categories: object-relational-mapping, synthetic-data-generation, test-case-generation, xml-document-processors, xml-documents, xpath-bugs, xpath-functionality, xquery
Everyone in AI Loves Synthetic Data—But No One Can Agree on What It Is | Post date: March 9, 2025 | Post author: Marc Ryan | Post categories: data, data-analytics, data-insights, generative-ai, hackernoon-top-story, imputation, synthetic-data, synthetic-data-generation
Improving Text Embeddings with Large Language Models: Training | Post date: March 1, 2025 | Post author: Auto Encoder: How to Ignore the Signal Noise | Post categories: ai-for-information-retrieval, beir-benchmark, contrastive-pre-training, language-models, multilingual-ai, natural-language-processing, synthetic-data-generation, text-embeddings
Improving Text Embeddings with Large Language Models: Instructions for Training and Evaluation | Post date: October 10, 2024 | Post author: Auto Encoder: How to Ignore the Signal Noise | Post categories: ai-for-information-retrieval, beir-benchmark, contrastive-pre-training, language-models, multilingual-ai, natural-language-processing, synthetic-data-generation, text-embeddings
Improving Text Embeddings with Large Language Models: Prompts for Synthetic Data Generation | Post date: October 10, 2024 | Post author: Auto Encoder: How to Ignore the Signal Noise | Post categories: ai-for-information-retrieval, beir-benchmark, contrastive-pre-training, language-models, multilingual-ai, natural-language-processing, synthetic-data-generation, text-embeddings
Improving Text Embeddings with Large Language Models: Implementation Details | Post date: October 9, 2024 | Post author: Auto Encoder: How to Ignore the Signal Noise | Post categories: ai-for-information-retrieval, beir-benchmark, contrastive-pre-training, language-models, multilingual-ai, natural-language-processing, synthetic-data-generation, text-embeddings
Improving Text Embeddings with Large Language Models: Conclusion and References | Post date: October 9, 2024 | Post author: Auto Encoder: How to Ignore the Signal Noise | Post categories: ai-for-information-retrieval, beir-benchmark, contrastive-pre-training, language-models, multilingual-ai, natural-language-processing, synthetic-data-generation, text-embeddings
Improving Text Embeddings with Large Language Models: Analysis of Training Hyperparameters | Post date: October 9, 2024 | Post author: Auto Encoder: How to Ignore the Signal Noise | Post categories: ai-for-information-retrieval, beir-benchmark, contrastive-pre-training, language-models, multilingual-ai, natural-language-processing, synthetic-data-generation, text-embeddings
Improving Text Embeddings with Large Language Models: Is Contrastive Pre-training Necessary? | Post date: October 9, 2024 | Post author: Auto Encoder: How to Ignore the Signal Noise | Post categories: ai-for-information-retrieval, beir-benchmark, contrastive-pre-training, language-models, multilingual-ai, natural-language-processing, synthetic-data-generation, text-embeddings
Improving Text Embeddings with Large Language Models: Multilingual Retrieval | Post date: October 9, 2024 | Post author: Auto Encoder: How to Ignore the Signal Noise | Post categories: ai-for-information-retrieval, beir-benchmark, contrastive-pre-training, language-models, multilingual-ai, natural-language-processing, synthetic-data-generation, text-embeddings
Improving Text Embeddings with Large Language Models: Main Results | Post date: October 9, 2024 | Post author: Auto Encoder: How to Ignore the Signal Noise | Post categories: ai-for-information-retrieval, beir-benchmark, contrastive-pre-training, language-models, multilingual-ai, natural-language-processing, synthetic-data-generation, text-embeddings
Improving Text Embeddings with Large Language Models: Model Fine-tuning and Evaluation | Post date: October 9, 2024 | Post author: Auto Encoder: How to Ignore the Signal Noise | Post categories: ai-for-information-retrieval, beir-benchmark, contrastive-pre-training, language-models, multilingual-ai, natural-language-processing, synthetic-data-generation, text-embeddings
Improving Text Embeddings with Large Language Models: Statistics of the Synthetic Data | Post date: October 9, 2024 | Post author: Auto Encoder: How to Ignore the Signal Noise | Post categories: ai-for-information-retrieval, beir-benchmark, contrastive-pre-training, language-models, multilingual-ai, natural-language-processing, synthetic-data-generation, text-embeddings
Improving Text Embeddings with Large Language Models: Synthetic Data Generation | Post date: October 9, 2024 | Post author: Auto Encoder: How to Ignore the Signal Noise | Post categories: ai-for-information-retrieval, beir-benchmark, contrastive-pre-training, language-models, multilingual-ai, natural-language-processing, synthetic-data-generation, text-embeddings