Document 09: Novelty Angle & Publication Strategy

Overview

This document outlines the novel contributions of Indonesia-MTEB, positions it relative to existing benchmarks, and provides a comprehensive publication strategy for top-tier NLP venues. It addresses the critical question: "What is new and why does it matter?" from the perspective of reviewers, program committees, and the broader NLP community.


1. Core Novelty Contributions

1.1 Primary Novelty Claims

| Novelty Dimension | Indonesia-MTEB Contribution | Differentiation |
| --- | --- | --- |
| Language Coverage | First comprehensive Indonesian text embedding benchmark covering all 8 MTEB task categories | Existing resources (IndoNLU, NusaCrowd) focus on classification/generation; no embedding benchmark exists |
| Regional Language Integration | Incorporates Javanese, Sundanese, Malay, and regional code-mixing evaluation | Regional MTEBs (VN-MTEB, TR-MTEB) are monolingual; SEA-BED covers 10 languages but is only 71% human-curated |
| Cultural Preservation Framework | Novel evaluation framework for Indonesian cultural terms, register detection, and code-mixing | No existing benchmark evaluates cultural term preservation in translation |
| 3-Pronged Data Strategy | Combines aggregation (50+ existing datasets), translation (full MTEB), and AI generation (novel tasks) | Other benchmarks rely primarily on translation alone |
| Kept Ratio Analysis | First systematic analysis of EN-ID translation quality by task type, with empirical thresholds | VN-MTEB reports kept ratios but lacks linguistic proximity analysis |

1.2 Novel Technical Contributions

1.2.1 Cultural Term Preservation Validation

# Novel evaluation component: Cultural term preservation
INDONESIAN_CULTURAL_TERMS = {
    # Social concepts
    "gotong royong", "pancasila", "rukun", "siskamling",
    # Religious/cultural
    "lebaran", "puasa", "halal bil halal", "nyepi", "waisak",
    # Culinary
    "warung", "nasi goreng", "rendang", "sate", "bakso",
    # Arts/crafts
    "batik", "wayang", "gamelan", "keris", "ikat",
    # Geographic/identity
    "merantau", "kampung", "desa", "kos"
}

def evaluate_cultural_preservation(source: str, translation: str) -> dict:
    """
    Novel evaluation metric for Indonesian MTEB.
    Not found in VN-MTEB, TR-MTEB, or C-MTEB.
    """
    source_terms = [t for t in INDONESIAN_CULTURAL_TERMS if t.lower() in source.lower()]
    preserved = [t for t in source_terms if t.lower() in translation.lower()]
    return {
        "preservation_rate": len(preserved) / len(source_terms) if source_terms else 1.0,
        "missing_terms": set(source_terms) - set(preserved)
    }
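
A brief usage sketch (the sentence pair below is hypothetical and only illustrates the mechanics; it assumes the function above is in scope):

# Hypothetical example pair for evaluate_cultural_preservation
source = "The restaurant serves rendang and sate, and the gift shop sells batik."
translation = "Restoran itu menyajikan rendang dan sate, dan toko suvenirnya menjual batik."

result = evaluate_cultural_preservation(source, translation)
print(result["preservation_rate"])  # 1.0 -> all three detected cultural terms survive
print(result["missing_terms"])      # set() -> nothing was dropped or paraphrased away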

1.2.2 Register Detection Evaluation

# Novel: Indonesian register (formal/informal) validation
FORMAL_MARKERS = ["yang", "dengan", "untuk", "tersebut", "melakukan"]
INFORMAL_MARKERS = ["yg", "dgn", "utk", "tu", "lakuin", "gan", "deh"]

def classify_register(text: str) -> str:
    """Coarse marker-count heuristic: more informal markers -> informal."""
    tokens = text.lower().split()
    formal = sum(tokens.count(m) for m in FORMAL_MARKERS)
    informal = sum(tokens.count(m) for m in INFORMAL_MARKERS)
    return "informal" if informal > formal else "formal"

def evaluate_register_preservation(source: str, translation: str) -> dict:
    """
    Evaluates whether translation preserves formality level.
    Critical for Indonesian, which distinguishes formal (baku) and informal (cakap santai) registers.
    """
    source_formality = classify_register(source)
    translation_formality = classify_register(translation)

    return {
        "register_preserved": source_formality == translation_formality,
        "source_register": source_formality,
        "translation_register": translation_formality
    }

1.2.3 Code-Mixing Validation

# Novel: Indonesian-English code-mixing detection
# A phenomenon increasingly common in Indonesian social media
from nltk.tokenize import word_tokenize

def detect_code_mixing(text: str) -> dict:
    """
    Detects and characterizes Indonesian-English code-mixing.
    Novel evaluation dimension not present in other MTEBs.
    Relies on a token-level language identifier (`language_id`), defined separately.
    """
    tokens = word_tokenize(text)
    if not tokens:
        return {"has_code_mixing": False, "switch_count": 0,
                "dominant_lang": None, "mixing_ratio": 0.0}
    lang_ids = [language_id(t) for t in tokens]

    # Count points where the language label changes between adjacent tokens
    switches = sum(1 for i in range(1, len(lang_ids)) if lang_ids[i] != lang_ids[i-1])

    return {
        "has_code_mixing": switches > 0,
        "switch_count": switches,
        "dominant_lang": max(set(lang_ids), key=lang_ids.count),
        "mixing_ratio": lang_ids.count("en") / len(tokens)
    }
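
The token-level `language_id` helper above is assumed rather than shown; a minimal wordlist-based sketch (the word lists are deliberately tiny and illustrative, not a real LID model) might look like:

# Illustrative token-level language identifier for ID/EN code-mixing.
# A production version would use a lexicon or a trained token-level LID model.
INDONESIAN_HINTS = {"yang", "dan", "di", "ke", "tidak", "ini", "itu", "aku", "kamu", "banget"}
ENGLISH_HINTS = {"the", "and", "is", "are", "you", "this", "that", "not", "very", "really"}

def language_id(token: str) -> str:
    t = token.lower()
    if t in INDONESIAN_HINTS:
        return "id"
    if t in ENGLISH_HINTS:
        return "en"
    return "id"  # fallback: assume the matrix language (Indonesian)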

1.3 Novel Methodological Contributions

| Contribution | Description | Why Novel |
| --- | --- | --- |
| EN-ID Linguistic Proximity Analysis | Systematic analysis of how linguistic proximity affects kept ratios | VN-MTEB (EN-VI) and TR-MTEB (EN-TR) analyze different language pairs; EN-ID has unique characteristics |
| Cultural Term Impact Study | First study of how cultural terms affect embedding similarity | No existing benchmark evaluates this |
| Regional Language Cross-Lingual Transfer | Evaluation of how Indonesian embeddings transfer to Javanese/Sundanese | SEA-BED covers multiple languages but doesn't study transfer learning |
| AI-Generated Dataset Validation Protocol | Framework for validating LLM-generated datasets for novel tasks (see the sketch after this table) | Goes beyond translation-focused approaches |
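
A minimal sketch of one filtering step such a validation protocol could include (the sample format, helper names, and thresholds here are assumptions, not the final protocol):

from sklearn.metrics.pairwise import cosine_similarity

# Illustrative filter for LLM-generated samples (e.g. clustering/reranking items).
# Assumes each sample is a dict with a "text" field; thresholds are placeholders.
def filter_generated_samples(samples, embedder, min_length=20, near_dup_threshold=0.95):
    kept, kept_embeddings = [], []
    for sample in samples:
        text = sample["text"].strip()
        if len(text) < min_length:  # drop degenerate or truncated generations
            continue
        emb = embedder.encode(text)
        # drop near-duplicates of samples we have already kept
        if any(cosine_similarity([emb], [prev])[0][0] >= near_dup_threshold
               for prev in kept_embeddings):
            continue
        kept.append(sample)
        kept_embeddings.append(emb)
    return kept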

2. Positioning Relative to Existing Benchmarks

2.1 Comparative Analysis

| Benchmark | Languages | Tasks | Datasets | Translation Source | Novelty Gap |
| --- | --- | --- | --- | --- | --- |
| MTEB | 112 | 8 | 1,308+ | N/A (original) | No Indonesian focus |
| MMTEB | 250+ | 8 | 500+ | Mixed | Indonesian coverage sparse |
| C-MTEB | 1 (zh) | 6 | 35 | Translation | Single language focus |
| VN-MTEB | 1 (vi) | 6 | 41 | Translation | No regional languages |
| TR-MTEB | 1 (tr) | 6 | 26 | Translation | No cultural evaluation |
| SEA-BED | 10 | 9 | 169 | 29% translated | Limited Indonesian datasets |
| ArabicMTEB | Arabic dialects | 8 | 47 | Mixed | Dialect focus, not typological |
| AfriMTEB | 59 | 14 | 38 | Mixed | African language focus |
| Indonesia-MTEB | 5+ (id, jv, su, ms, en) | 8 | 50-100+ | Aggregation + Translation + AI | First comprehensive Indonesian embedding benchmark with cultural evaluation |

2.2 Unique Value Proposition

Indonesia-MTEB is the only benchmark that:

  1. Covers all 8 MTEB task categories for Indonesian (existing benchmarks cover 0-6 tasks)

  2. Evaluates cultural term preservation in machine-translated datasets (no existing benchmark has this)

  3. Incorporates regional language evaluation (Javanese, Sundanese, Malay) alongside Indonesian

  4. Validates code-mixing detection for Indonesian-English social media text

  5. Combines three data strategies (aggregation, translation, AI generation) in a unified framework

  6. Provides linguistic proximity analysis for Austronesian language pair (EN-ID)

  7. Offers register-aware evaluation for formal/informal Indonesian distinction

2.3 Competitive Advantages

| Dimension | Indonesia-MTEB Advantage |
| --- | --- |
| Coverage | Only benchmark with 8/8 tasks for Indonesian |
| Quality | 3-stage validation pipeline with cultural term preservation |
| Cultural Sensitivity | Novel evaluation of Indonesian cultural concepts |
| Regional Integration | First to evaluate Javanese/Sundanese cross-lingual transfer |
| Sociolinguistic Awareness | Code-mixing and register evaluation |
| Data Diversity | Combines existing datasets + translation + AI generation |

3. Publication Strategy

3.1 Target Venues

3.1.1 Primary Targets (Tier 1)

| Venue | Deadline | Acceptance Rate | Fit |
| --- | --- | --- | --- |
| ACL 2026 | Feb 2026 | ~22% | Highest prestige, strong fit for resource paper |
| EMNLP 2026 | May/June 2026 | ~22% | Strong fit for empirical benchmark paper |
| NAACL 2026 | Sep/Oct 2025 | ~23% | Regional relevance (Americas focus decreasing) |
| COLING 2026 | Early 2026 | ~25% | International focus, good for multilingual work |

3.1.2 Secondary Targets (Specialized Tracks)

| Venue | Track | Deadline | Fit |
| --- | --- | --- | --- |
| NeurIPS Datasets & Benchmarks | Datasets Track | ~May 2026 | Strong fit for novelty |
| ICLR | Datasets & Benchmarks | ~Sep 2026 | Growing NLP presence |
| AAAI | Senior/General Track | ~August 2026 | AI breadth, good fit |
| AACL | Main Track | ~Mid 2026 | Asian language focus |

3.1.3 Backup Options

| Venue | Notes |
| --- | --- |
| TACL | Journal format; after conference rejection |
| Findings of ACL/EMNLP/NAACL | After main conference rejection |
| LREC | Language resources focus |
| INTERSPEECH | If focusing on spoken/parallel datasets |

3.2 Submission Timeline Strategy

Phase 1: Preparation (3-4 months before target deadline)
├── Complete dataset creation and validation
├── Run benchmark experiments
├── Draft paper
└── Pre-print on arXiv (builds citations)

Phase 2: ARR Submission (2-3 months before deadline)
├── Submit to ACL Rolling Review
├── Select target venue (e.g., EMNLP 2026)
├── Address reviewer feedback
└── Resubmit if needed

Phase 3: Conference Selection
├── Upon acceptance, select from ACL/EMNLP/NAACL
├── Prepare camera-ready version
└── Prepare presentation materials

| Phase | Venue | Purpose | Timeline |
| --- | --- | --- | --- |
| 1. Pre-print | arXiv | Establish priority, gather feedback | Month 1 |
| 2. Workshop | AACL/CL/WS | Refine methodology, build community | Months 3-4 |
| 3. Main Conference | ACL/EMNLP | Primary publication target | Months 6-8 |
| 4. Journal Extension | TACL/CL | Extended version with additional analysis | Month 12+ |

4. Addressing Reviewer Concerns

4.1 Common Reviewer Criticisms and Responses

| Concern | Anticipated Question | Response Strategy |
| --- | --- | --- |
| "Just translation of MTEB" | What's novel beyond translation? | Emphasize: (1) Cultural term preservation framework, (2) Regional language integration, (3) Aggregation of 50+ existing Indonesian datasets, (4) AI-generated datasets for novel tasks |
| "Limited to Indonesian" | Why does this matter globally? | Position as: (1) 4th most spoken language (270M+ speakers), (2) Proxy for Austronesian languages, (3) Case study in cultural preservation, (4) Bridge between SEA and global NLP |
| "Translation quality issues" | How do you ensure quality? | Cite: (1) 3-stage validation pipeline, (2) Empirical similarity thresholds, (3) Kept ratio transparency, (4) Human calibration data |
| "Existing work covers this" | What about NusaCrowd/SEACrowd? | Differentiate: (1) NusaCrowd is a data hub, not an embedding benchmark, (2) SEACrowd/SEA-BED lack Indonesian cultural evaluation, (3) Indonesia-MTEB is embedding-specific with 8/8 task coverage |
| "LLM-generated data concerns" | Is synthetic data valid? | Respond: (1) Used only for novel tasks (Clustering, Reranking) with no existing Indonesian data, (2) Rigorous validation pipeline, (3) Transparency in data cards |
| "Benchmark drift" | Won't this become outdated quickly? | Respond: (1) Open-source framework allows community updates, (2) MTEB integration ensures longevity, (3) Version control and documentation |

4.2 "So What?" Test

Reviewers will ask: Why does this matter?

Tier 1 Responses (Societal Impact):

  1. Language Justice: 270M Indonesian speakers deserve embedding evaluation parity with English/Chinese
  2. Cultural Preservation: Framework for preserving cultural concepts in ML pipelines
  3. Regional Language Support: First benchmark to evaluate Javanese (98M speakers) and Sundanese (42M speakers) embeddings

Tier 2 Responses (Technical Contributions):

  1. Linguistic Proximity Insights: EN-ID kept ratios differ from EN-VI (VN-MTEB) and EN-TR (TR-MTEB) due to typological factors
  2. Cultural Term Evaluation: Novel framework applicable to other cultures
  3. Code-Mixing Evaluation: First systematic code-mixing validation for embeddings

Tier 3 Responses (Research Infrastructure):

  1. MTEB Integration: Adds Indonesian to the global embedding evaluation ecosystem
  2. Open Resource: Enables Indonesian researchers to evaluate locally-relevant models
  3. Benchmark Diversity: Addresses geographic imbalance in NLP benchmarks

4.3 Positioning Statement

"Indonesia-MTEB addresses a critical gap in the embedding evaluation landscape: the absence of comprehensive benchmarks for the world's fourth most spoken language. Beyond translating existing resources, we introduce novel evaluation frameworks for cultural preservation, code-mixing, and register—phenomena central to Indonesian sociolinguistics but absent from existing benchmarks. Our work provides both immediate utility for Indonesian NLP and generalizable insights for culturally-aware embedding evaluation."


5. Novelty Deep Dives

5.1 Cultural Term Preservation Framework

Novel Research Questions:

  1. How do embedding similarity scores correlate with cultural term preservation?
  2. Which translation models best preserve Indonesian cultural concepts?
  3. Does cultural term loss affect downstream embedding performance?

Methodology:

# Novel evaluation: Cultural term semantic drift
from sklearn.metrics.pairwise import cosine_similarity

def measure_cultural_semantic_drift(source: str, translation: str,
                                    embedder, cultural_terms: set) -> dict:
    """
    Measures whether cultural terms maintain semantic similarity
    across translation. Novel contribution not in other MTEBs.
    """
    source_embedding = embedder.encode(source)
    translation_embedding = embedder.encode(translation)

    baseline_similarity = cosine_similarity([source_embedding], [translation_embedding])[0][0]

    # Extract sentences containing cultural terms
    cultural_sentences = extract_cultural_sentences(source, cultural_terms)
    translated_cultural = extract_cultural_sentences(translation, cultural_terms)

    if len(cultural_sentences) == 0 or len(translated_cultural) == 0:
        # No cultural passages to compare: report zero drift
        return {
            "baseline_similarity": baseline_similarity,
            "cultural_similarity": baseline_similarity,
            "drift": 0.0
        }

    # Mean pairwise similarity between cultural passages on both sides
    cultural_similarity = cosine_similarity(
        embedder.encode(cultural_sentences),
        embedder.encode(translated_cultural)
    ).mean()

    return {
        "baseline_similarity": baseline_similarity,
        "cultural_similarity": cultural_similarity,
        "drift": baseline_similarity - cultural_similarity
    }
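
The `extract_cultural_sentences` helper used above is assumed; one minimal sketch (with a deliberately naive sentence splitter) could be:

import re

def extract_cultural_sentences(text: str, cultural_terms: set) -> list:
    """Return sentences that mention at least one cultural term (naive regex splitter)."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    terms = [t.lower() for t in cultural_terms]
    return [s for s in sentences if any(t in s.lower() for t in terms)]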

Publication Angle:

- Empirical study of cultural concept preservation in machine translation
- Framework applicable to other under-resourced cultures
- Insights for multilingual embedding model training

5.2 Code-Mixing Evaluation

Novel Research Questions:

  1. Do embedding models treat code-mixed Indonesian-English differently from monolingual text?
  2. How does code-mixing affect retrieval and semantic similarity performance?
  3. Can embeddings identify language boundaries in code-mixed text?

Novel Dataset Component:

# Novel: Code-mixing evaluation dataset
class IndonesianCodeMixingDataset:
    """
    Novel dataset for evaluating code-mixed Indonesian-English embeddings.
    First of its kind in the MTEB ecosystem.
    """
    def __init__(self):
        self.tasks = [
            "code_mixing_classification",  # Detect if text is code-mixed
            "language_boundary_detection",  # Identify ID/EN boundaries
            "code_mixed_similarity",  # Semantic similarity in code-mixed pairs
            "code_mixed_retrieval"  # Retrieval with code-mixed queries
        ]

Publication Angle:

- First systematic code-mixing evaluation for text embeddings
- Indonesian as a case study for global code-mixing phenomena
- Insights for embedding model development in multilingual societies

5.3 Regional Language Cross-Lingual Transfer

Novel Research Questions:

  1. Can Indonesian embeddings transfer to Javanese/Sundanese without retraining?
  2. How does cross-lingual transfer performance compare to dedicated models?
  3. Which tasks show better cross-lingual transfer?

Novel Evaluation:

# Novel: Regional language cross-lingual transfer evaluation
def evaluate_cross_lingual_transfer(indonesian_model, javanese_texts,
                                     javanese_labels, task: str) -> dict:
    """
    Evaluates how well Indonesian-trained embeddings perform on Javanese.
    Novel contribution: no existing benchmark studies Austronesian transfer.
    """
    # Encode Javanese texts with Indonesian model
    embeddings = indonesian_model.encode(javanese_texts)

    # Evaluate on task (classification, clustering, etc.)
    performance = evaluate_task(embeddings, javanese_labels, task)

    return {
        "cross_lingual_performance": performance,
        "task": task,
        "source_language": "id",
        "target_language": "jv"
    }
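
The `evaluate_task` call above is an assumed helper; a minimal sketch for the classification case (a logistic-regression probe over frozen embeddings, in the spirit of MTEB classification tasks) might be:

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Only the classification branch is sketched; other task types would need their own scorers.
def evaluate_task(embeddings, labels, task: str) -> float:
    if task != "classification":
        raise NotImplementedError("Only the classification probe is sketched here.")
    X_train, X_test, y_train, y_test = train_test_split(
        embeddings, labels, test_size=0.2, random_state=42)
    probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    return accuracy_score(y_test, probe.predict(X_test))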

Publication Angle:

- Austronesian language family transfer learning
- Resource efficiency for regional languages
- Implications for multilingual embedding model design


6. Novelty Claims by Paper Section

6.1 Abstract Novelty Hooks

We introduce Indonesia-MTEB, the first comprehensive text embedding benchmark
for Indonesian covering all 8 MTEB task categories. Despite 270+ million
speakers, Indonesian lacks embedding evaluation infrastructure. Our benchmark
offers three novel contributions: (1) a [CULTURAL TERM PRESERVATION FRAMEWORK]
for evaluating translation quality, (2) first systematic [CODE-MIXING EVALUATION]
for Indonesian-English embeddings, and (3) [REGIONAL LANGUAGE TRANSFER ANALYSIS]
for Javanese and Sundanese. Through a three-pronged data strategy—aggregating
50+ existing datasets, translating MTEB resources, and AI-generating novel tasks—
we provide 50-100+ datasets across all embedding task types. Experiments on
18 models reveal unique patterns in [AUSTRONESIAN EMBEDDING PERFORMANCE], with
implications for multilingual embedding development.

6.2 Introduction Novelty Claims

| Paragraph | Novelty Emphasis |
| --- | --- |
| Motivation | Indonesian = 4th most spoken language; zero comprehensive embedding benchmarks |
| Gap Analysis | Existing work (IndoNLU, NusaCrowd) focuses on classification; embedding evaluation is missing |
| Our Contribution | 8/8 MTEB tasks + cultural framework + regional languages |
| Implications | Model for other under-resourced languages; cultural preservation insights |

6.3 Methodology Novelty Claims

| Component | Novelty Description |
| --- | --- |
| 3-Stage Validation | Adapted from VN-MTEB but with Indonesian-specific thresholds (≥0.75 vs ≥0.80); see the sketch after this table |
| Cultural Term Validation | Novel; no equivalent in any MTEB variant |
| Code-Mixing Detection | Novel for embedding evaluation |
| Register Preservation | Novel; leverages the Indonesian formal/informal distinction |
| Regional Transfer | Novel; Austronesian language family analysis |
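
A minimal sketch of how the stage-level similarity threshold and the kept ratio relate (the `similarity_score` function is an assumption standing in for the validation pipeline's scorer):

# Illustrative kept-ratio computation over (English source, Indonesian translation) pairs.
# `similarity_score` is assumed to return a semantic similarity in [0, 1].
def compute_kept_ratio(pairs, similarity_score, threshold=0.75):
    kept = [(src, tgt) for src, tgt in pairs if similarity_score(src, tgt) >= threshold]
    return {
        "kept_ratio": len(kept) / len(pairs) if pairs else 0.0,
        "kept_pairs": kept,
    }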

6.4 Experiments Novelty Claims

| Experiment | Novelty Hook |
| --- | --- |
| Kept Ratio by Task | First EN-ID analysis; differs from EN-VI (VN-MTEB) and EN-TR (TR-MTEB) |
| Cultural Term Impact | Novel study of how cultural concepts affect embedding similarity |
| Model Comparison | RoPE vs APE analysis in the Indonesian context |
| Regional Transfer | First Javanese/Sundanese embedding evaluation |
| Code-Mixing Performance | Novel task type for Indonesian |

7. Angle-Specific Publication Strategies

7.1 Angle 1: Cultural Preservation in NLP

Core Thesis: Machine translation and embedding evaluation must account for cultural concepts that lack direct translations.

Target Venues: AACL, ACL, EMNLP

Novel Emphasis:

- Cultural term preservation framework
- Empirical analysis of translation's impact on cultural concepts
- Implications for culturally-aware NLP systems

Potential Title:

"Indonesia-MTEB: A Cultural Preservation Framework for Text Embedding Benchmarks"

7.2 Angle 2: Code-Mixing and Sociolinguistics

Core Thesis: Embedding models must handle real-world sociolinguistic phenomena like code-mixing and register variation.

Target Venues: EMNLP, NAACL, ACL

Novel Emphasis:

- Code-mixing detection and evaluation
- Register-aware embedding evaluation
- Real-world Indonesian social media validation

Potential Title:

"Beyond Monolingual: Evaluating Embeddings on Code-Mixed Indonesian Text"

7.3 Angle 3: Regional Language Support

Core Thesis: Benchmarks for major languages should enable evaluation for related regional languages.

Target Venues: AACL, COLING, LREC

Novel Emphasis:

- Javanese and Sundanese evaluation
- Cross-lingual transfer within the Austronesian family
- Resource-efficient multilingual embedding development

Potential Title:

"Indonesia-MTEB: Benchmarking Embeddings for Indonesian and Regional Languages"

7.4 Angle 4: Translation Quality for Embeddings

Core Thesis: Machine translation for embedding evaluation requires different quality metrics than traditional MT.

Target Venues: WMT, EMNLP, ACL

Novel Emphasis:

- Semantic similarity thresholds for EN-ID
- Kept ratio analysis by task type
- Comparison with VN-MTEB (EN-VI) and TR-MTEB (EN-TR)

Potential Title:

"Translation Quality for Embedding Evaluation: The EN-ID Case Study"


8. Building Citation Potential

8.1 Citable Components

Each dataset and methodology component should be independently citable:

| Component | Citation Type | Expected Citations |
| --- | --- | --- |
| Individual Datasets | Dataset (HuggingFace) | Per dataset use |
| Benchmark Framework | Conference paper | Primary citation |
| Cultural Framework | Method paper | Cultural NLP work |
| Code-Mixing Protocol | Method paper | Code-mixing research |
| Kept Ratio Analysis | Analysis paper | Translation research |

8.2 Pre-Publication Strategy

Before Conference Submission:

  1. arXiv Pre-print (Month 1): Establish priority
  2. HuggingFace Datasets (Month 2): Enable early adoption
  3. Blog Post (Month 2): Build community awareness
  4. Workshop Paper (Month 3-4): Refine methodology

Target Workshops:

- AACL Workshop (Asian Language Resources)
- Workshop on NLP for Indigenous Languages
- Workshop on Translation and Semantics
- Workshop on Benchmarking

8.3 Building a Citation Ecosystem

% Primary citation
@inproceedings{indonesia_mteb_2026,
  title={Indonesia-MTEB: A Comprehensive Text Embedding Benchmark for Indonesian},
  author={Authors},
  booktitle={ACL/EMNLP},
  year={2026}
}

% Dataset citation
@dataset{indonesia_mteb_datasets,
  title={Indonesia-MTEB Dataset Collection},
  author={Authors},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/indonesia-mteb}
}

% Cultural framework citation
@inproceedings{cultural_preservation_2026,
  title={Cultural Term Preservation in Machine Translation for Embedding Evaluation},
  author={Authors},
  booktitle={Workshop on Culturally-Aware NLP},
  year={2026}
}

9. Risk Mitigation

9.1 Identified Risks and Mitigations

| Risk | Probability | Impact | Mitigation |
| --- | --- | --- | --- |
| Reviewer: "Just translation" | High | High | Emphasize cultural framework, regional languages, AI-generated datasets |
| Reviewer: "Not generalizable" | Medium | Medium | Position as a case study for Austronesian languages and the cultural preservation framework |
| MTEB rejection | Low | Medium | Engage with MTEB maintainers early; prepare independent release |
| Translation quality concerns | Medium | Medium | Transparent kept ratios; human calibration data; multiple validation stages |
| Competition from SEA-BED | Medium | Low | SEA-BED covers 10 languages shallowly; Indonesia-MTEB goes deep on Indonesian |
| Code-mixing criticism | Low | Low | Limited to specific datasets; not the main contribution |

9.2 Contingency Plans

If Rejected from Main Conference:

  1. Findings Track: Resubmit to same venue's Findings
  2. Alternative Venue: Submit to different tier-1 conference
  3. Workshop + Arxiv: Build citations, resubmit to main venue
  4. Journal Submission: Convert to TACL/CL journal submission

If MTEB Integration Fails:

  1. Independent Release: Publish on HuggingFace with custom evaluation code
  2. Community Engagement: Build Indonesia-MTEB community
  3. Alternative Framework: Explore integration with other evaluation frameworks

10. Novelty Checklist

10.1 Pre-Submission Novelty Verification

  • Cultural Term Preservation Framework is clearly described and empirically validated
  • Code-Mixing Evaluation component is included (even if limited)
  • Regional Language Analysis (Javanese/Sundanese) is present
  • Kept Ratio Analysis includes comparison with VN-MTEB and TR-MTEB
  • 3-Pronged Strategy (aggregation + translation + AI) is emphasized
  • MTEB Integration is documented or in progress
  • Open Source Release on HuggingFace is completed
  • Cultural Sensitivity is acknowledged and addressed
  • Linguistic Proximity Analysis (EN-ID vs EN-VI vs EN-TR) is included
  • Broader Impact Statement addresses language justice

10.2 Novelty Narrative Checklist

  • Abstract clearly states unique contributions
  • Introduction quantifies the gap (270M speakers, 0 comprehensive benchmarks)
  • Related Work positions relative to all major MTEBs
  • Methodology section describes novel validation frameworks
  • Experiments section includes unique analyses (cultural, code-mixing, regional)
  • Discussion generalizes findings beyond Indonesian
  • Conclusion articulates implications for other under-resourced languages

11. Summary: Core Novelty Statement

Indonesia-MTEB's unique contributions:

  1. Coverage: First comprehensive Indonesian embedding benchmark (all 8 MTEB tasks)

  2. Cultural Framework: Novel evaluation of cultural term preservation in translation

  3. Sociolinguistic Awareness: First code-mixing and register evaluation for Indonesian embeddings

  4. Regional Integration: Evaluation of Javanese and Sundanese alongside Indonesian

  5. Linguistic Proximity Insights: Empirical analysis of EN-ID translation quality by task type

  6. 3-Pronged Strategy: Combines aggregation, translation, and AI generation

  7. Open Resource: Community-driven integration with MTEB ecosystem

Positioning: Indonesia-MTEB serves both as an immediate resource for Indonesian NLP and as a case study for culturally-aware, sociolinguistically-informed embedding evaluation applicable to other under-resourced languages and cultures.


12. Key Citations

12.1 Must-Cite Benchmarks

| Benchmark | Citation | Key Differentiation |
| --- | --- | --- |
| MTEB | Muennighoff et al., 2023 | Original framework |
| MMTEB | Enevoldsen et al., 2025 | Multilingual expansion |
| VN-MTEB | Pham et al., 2025 | Translation pipeline |
| TR-MTEB | Baysan et al., 2025 | Turkish benchmark |
| SEA-BED | Ponwitayarat et al., 2025 | SEA regional coverage |
| C-MTEB | Xiao et al., 2023 | Chinese benchmark |

12.2 Must-Cite Indonesian Resources

| Resource | Citation | Use |
| --- | --- | --- |
| IndoNLU | Wilie et al., 2020 | Indonesian NLP tasks |
| Indo4B | Wilie et al., 2020 | Pre-training corpus |
| NusaCrowd | Cahyawijaya et al., 2023 | Indonesian dataset hub |
| SEACrowd | SEACrowd Consortium, 2024 | SEA dataset hub |

12.3 Supporting Literature

| Topic | Key Citations |
| --- | --- |
| Cultural NLP | |
| Code-Mixing | |
| Embedding Evaluation | |
| Low-Resource Languages | |
| Machine Translation Quality | |

Next Steps:

  1. Finalize datasets and validation
  2. Run benchmark experiments
  3. Draft the paper following the novelty framework
  4. Submit to ARR for the target venue
  5. Engage with the MTEB community for integration