Section 10

Model Evaluation

Rigorous benchmarking of LLMs using metrics such as BLEU, ROUGE, and perplexity.
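
Perplexity, one of the metrics named above, is the exponential of the average negative log-probability a model assigns to the observed tokens. A minimal sketch in pure Python (the probability values are illustrative, not from any model in this portfolio):

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the mean negative log-probability
    the model assigned to each observed token. Lower is better."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# A model that assigns uniform probability 1/4 to every token
# has perplexity 4 (up to float rounding).
print(perplexity([0.25, 0.25, 0.25]))
```

A perfectly confident model (probability 1.0 on every observed token) scores a perplexity of exactly 1, the metric's floor.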

Projects in this section: 2

LLM Metrics (BLEU/ROUGE)

Performance evaluation of generated text against reference outputs.
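
The core of both metrics is n-gram overlap: BLEU is precision-oriented (matched candidate n-grams over candidate length) while ROUGE-1 is recall-oriented (matched n-grams over reference length). A simplified unigram-only sketch, omitting BLEU's brevity penalty and higher-order n-grams (the example sentences are illustrative):

```python
from collections import Counter

def ngram_overlap(candidate, reference, n=1):
    """Clipped n-gram match count between two token lists: each candidate
    n-gram counts at most as often as it appears in the reference."""
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    return sum(min(count, ref[gram]) for gram, count in cand.items())

def bleu1_precision(candidate, reference):
    """Unigram BLEU precision. Full BLEU also multiplies in a brevity
    penalty and the geometric mean of 2- to 4-gram precisions."""
    return ngram_overlap(candidate, reference) / len(candidate)

def rouge1_recall(candidate, reference):
    """ROUGE-1 recall: matched unigrams over reference length."""
    return ngram_overlap(candidate, reference) / len(reference)

cand = "the cat sat on the mat".split()
ref = "the cat is on the mat".split()
print(bleu1_precision(cand, ref))  # 5 of 6 candidate unigrams match
print(rouge1_recall(cand, ref))    # 5 of 6 reference unigrams covered
```

The clipping in `ngram_overlap` is what stops a degenerate candidate like "the the the" from scoring perfect precision.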

Summarization Benchmark

Comparative evaluation of T5 versus GPT-2 on summarization quality.
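
A head-to-head summarization benchmark like this typically scores each model's output against the same reference with ROUGE-L, which is built on the longest common subsequence (LCS). A sketch under that assumption, with invented outputs standing in for actual T5 and GPT-2 generations:

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of two token lists,
    the quantity underlying ROUGE-L."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(candidate, reference):
    """ROUGE-L F1: harmonic mean of LCS-based precision and recall."""
    lcs = lcs_len(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return 2 * precision * recall / (precision + recall)

reference = "the senate passed the climate bill on friday".split()
outputs = {  # illustrative outputs, not actual model generations
    "t5": "the senate passed the climate bill friday".split(),
    "gpt2": "a bill about climate passed".split(),
}
for name, summary in outputs.items():
    print(name, round(rouge_l_f1(summary, reference), 3))
```

Unlike the unigram metrics, ROUGE-L rewards in-order matches, so a summary that preserves the reference's word order scores higher than a bag-of-words paraphrase with the same vocabulary overlap.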