NVIDIA NCA-GENL Prompting, RAG, and Evaluation

1. NCA-GENL Topic 3 Question 2

Question

What is the prompt “Translate English to French: cheese =>” an example of?

A. Few-shot learning
B. Fine tuning a model
C. One-shot learning
D. Zero-shot learning
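
The prompting styles named in the options differ in how many solved examples appear in the prompt ahead of the query, whereas fine-tuning changes the model weights rather than the prompt. The sketch below builds all three prompt styles for the translation task in the question. The helper name `build_prompt` and the example pairs are illustrative assumptions, not part of the exam material.

```python
def build_prompt(task, query, examples=()):
    """Build a prompt: task description, zero or more solved examples, then the query."""
    lines = [f"{task}:"]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    lines.append(f"{query} =>")
    return "\n".join(lines)


task = "Translate English to French"

# No solved examples: only the task description and the query.
zero_shot = build_prompt(task, "cheese")

# Exactly one solved example before the query.
one_shot = build_prompt(task, "cheese", [("sea otter", "loutre de mer")])

# Several solved examples before the query.
few_shot = build_prompt(
    task,
    "cheese",
    [
        ("sea otter", "loutre de mer"),
        ("peppermint", "menthe poivrée"),
        ("plush giraffe", "girafe peluche"),
    ],
)

for name, prompt in [("zero-shot", zero_shot), ("one-shot", one_shot), ("few-shot", few_shot)]:
    print(f"--- {name} ---\n{prompt}\n")
```

Counting the `=>` markers in each prompt is a quick way to tell the styles apart: the final marker belongs to the query, and every earlier one belongs to a solved example.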

2. NCA-GENL Topic 3 Question 3

Question

Which metric is primarily used to evaluate the quality of the text generated by language models?

A. Perplexity
B. Precision
C. Recall
D. Accuracy
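
Perplexity is the exponential of the average negative log-probability a language model assigns to each token of a text, so lower values mean the model was less "surprised". The following is a minimal sketch that assumes the per-token natural-log probabilities have already been obtained from some model; the two probability lists are made-up inputs for illustration.

```python
import math


def perplexity(token_log_probs):
    """Return exp(-mean(log p)) for a sequence of natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Hypothetical model outputs: probability assigned to each observed token.
confident = [math.log(0.5)] * 4   # every token predicted with p = 0.5
uncertain = [math.log(0.1)] * 4   # every token predicted with p = 0.1

print(round(perplexity(confident), 3))
print(round(perplexity(uncertain), 3))
```

With a constant per-token probability p, the perplexity is 1/p, so the first sequence scores about 2 and the second about 10.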

3. NCA-GENL Topic 3 Question 4

Question

In the field of AI experimentation, what is the GLUE benchmark used to evaluate performance of?

A. AI models on speech recognition tasks.
B. AI models on image recognition tasks.
C. AI models on a range of natural language understanding tasks.
D. AI models on reinforcement learning tasks.

4. NCA-GENL Topic 3 Question 6

Question

In evaluating the transformer model for translation tasks, what is a common approach to assess its performance?

A. Analyzing the lexical diversity of the model’s translations compared to source texts.
B. Comparing the model’s output with human-generated translations on a standard dataset.
C. Evaluating the consistency of translation tone and style across different genres of text.
D. Measuring the syntactic complexity of the model’s translations against a corpus of professional translations.

5. NCA-GENL Topic 3 Question 7

Question

Which metric is commonly used to evaluate machine-translation models?

A. Mean Absolute Error (MAE)
B. BLEU score
C. F1 score
D. Accuracy
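
BLEU scores a candidate translation by its clipped n-gram overlap with one or more reference translations, multiplied by a brevity penalty for candidates shorter than the reference. The sketch below is a deliberately simplified version, assuming a single reference, whitespace tokenization, n-grams up to bigrams, and no smoothing. Standard BLEU is normally computed over a whole corpus with n-grams up to length 4, so treat `simple_bleu` as an illustration of the idea rather than a reference implementation.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=2):
    """Simplified sentence-level BLEU against a single reference (no smoothing)."""
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: 1 when the candidate is longer than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat"
print(simple_bleu("the cat sat on the mat", reference))   # identical to the reference
print(simple_bleu("the cat sat", reference))              # correct but too short
print(simple_bleu("a dog ran", reference))                # no overlap
```

The second call shows the brevity penalty at work: every n-gram in the candidate matches, yet the score stays below 1 because the candidate is shorter than the reference.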

6. NCA-GENL Topic 3 Question 8

Question

You have developed a deep learning model for a recommendation system. You want to evaluate the performance of the model using A/B testing. What is the rationale for using A/B testing with deep learning model performance?

A. A/B testing ensures that the deep learning model is robust and can handle different variations of input data.
B. A/B testing allows for a controlled comparison between two versions of the model, helping to identify the version that performs better.
C. A/B testing methodologies integrate rationale and technical commentary from the designers of the deep learning model.
D. A/B testing helps in collecting comparative latency data to evaluate the performance of the deep learning model.
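
In an A/B test, users are split at random between two variants and an outcome metric is compared across the groups. As a minimal sketch, assuming the metric is a click-through rate and the samples are large enough for a normal approximation, a two-proportion z-test can indicate whether the observed difference is likely to be noise. The function name and the traffic figures below are hypothetical.

```python
import math


def two_proportion_z_test(clicks_a, users_a, clicks_b, users_b):
    """Return (z, two-sided p-value) for the difference in click-through rates B - A."""
    rate_a = clicks_a / users_a
    rate_b = clicks_b / users_b
    # Pooled rate under the null hypothesis that both variants perform the same.
    pooled = (clicks_a + clicks_b) / (users_a + users_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical traffic: model A (current) vs. model B (new recommender).
z, p_value = two_proportion_z_test(clicks_a=100, users_a=1000, clicks_b=130, users_b=1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A positive z with a small p-value would suggest that model B's higher click-through rate is unlikely to be a chance fluctuation, which is the kind of controlled comparison the question describes.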

7. NCA-GENL Topic 3 Question 9

Question

In the evaluation of Natural Language Processing (NLP) systems, what do ‘validity’ and ‘reliability’ imply regarding the selection of evaluation metrics?

A. Validity involves the metric’s ability to predict future trends in data, and reliability refers to its capacity to integrate with multiple data sources.
B. Validity ensures the metric accurately reflects the intended property to measure, while reliability ensures consistent results over repeated measurements.
C. Validity is concerned with the metric’s computational cost, while reliability is about its applicability across different NLP platforms.
D. Validity refers to the speed of metric computation, whereas reliability pertains to the metric’s performance in high-volume data processing.

8. NCA-GENL Topic 3 Question 10

Question

What metrics would you use to evaluate the performance of a RAG workflow in terms of the accuracy of responses generated in relation to the input query? (Choose two.)

A. Generator latency
B. Retriever latency
C. Tokens generated per second
D. Response relevancy
E. Context precision
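
Quality-oriented RAG metrics look at what was retrieved and what was generated, unlike latency or throughput figures. As one illustration, the sketch below computes a rank-aware context precision: the average of precision@k taken at each rank that holds a relevant chunk. This is one common formulation and is assumed here for illustration; evaluation libraries may define the metric differently. The relevance flags are made-up judgments that, in practice, would come from human labels or an LLM judge. Response relevancy usually needs an embedding model or a judge model and is not sketched here.

```python
def context_precision(relevance_flags):
    """Average precision@k over the ranks whose retrieved chunk is relevant.

    relevance_flags[i] is True when the chunk at rank i + 1 is relevant to the query.
    """
    hits = 0
    weighted_sum = 0.0
    for rank, relevant in enumerate(relevance_flags, start=1):
        if relevant:
            hits += 1
            weighted_sum += hits / rank  # precision@rank at this relevant position
    return weighted_sum / hits if hits else 0.0


# Hypothetical retrieval results for the same query, ranked best-first.
relevant_first = [True, True, False]
relevant_last = [False, True, True]

print(context_precision(relevant_first))
print(context_precision(relevant_last))
```

Both result lists contain two relevant chunks, but the first one scores higher because the relevant chunks are ranked ahead of the irrelevant one.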

9. NCA-GENL Topic 5 Question 1

Question

“Hallucinations” is a term coined to describe when LLM models produce what?

A. Outputs are only similar to the input data.
B. Images from a prompt description.
C. Correct sounding results that are wrong.
D. Grammatically incorrect or broken outputs.

10. NCA-GENL Topic 5 Question 6

Question

How can Retrieval Augmented Generation (RAG) help developers to build a trustworthy AI system?

A. RAG can enhance the security features of AI systems, ensuring confidential computing and encrypted traffic.
B. RAG can improve the energy efficiency of AI systems, reducing their environmental impact and cooling requirements.
C. RAG can align AI models with one another, improving the accuracy of AI systems through cross-checking.
D. RAG can generate responses that cite reference material from an external knowledge base, ensuring transparency and verifiability.
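
A RAG pipeline retrieves passages from an external knowledge base and places them in the prompt, which lets the generated answer point back to its sources. The toy sketch below assumes a keyword-overlap retriever and an in-memory dictionary as the knowledge base; real systems typically use vector search, and the document IDs, texts, and function names are hypothetical. The sketch stops at building the prompt and does not call any model.

```python
def retrieve(query, knowledge_base, top_k=2):
    """Rank documents by how many query words they share; return the top document IDs."""
    query_terms = set(query.lower().split())
    scored = []
    for doc_id, text in knowledge_base.items():
        overlap = len(query_terms & set(text.lower().split()))
        if overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by document ID for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def build_cited_prompt(query, knowledge_base, top_k=2):
    """Return (prompt, doc_ids) where the prompt lists numbered sources to cite."""
    doc_ids = retrieve(query, knowledge_base, top_k)
    sources = "\n".join(
        f"[{number}] ({doc_id}) {knowledge_base[doc_id]}"
        for number, doc_id in enumerate(doc_ids, start=1)
    )
    prompt = (
        "Answer using only the sources below and cite them by number.\n"
        f"Sources:\n{sources}\n"
        f"Question: {query}\nAnswer:"
    )
    return prompt, doc_ids


# Hypothetical knowledge base.
knowledge_base = {
    "doc-bleu": "BLEU compares machine translation output with reference translations",
    "doc-glue": "GLUE is a benchmark for natural language understanding tasks",
}

prompt, doc_ids = build_cited_prompt("what does the GLUE benchmark measure", knowledge_base)
print(prompt)
```

Because each source carries a number and an ID, a reader can trace a citation such as "[1]" in the answer back to the passage it came from and check the claim.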
