What Happens If the Prompt Exceeds 8,196 Tokens? And What Is the Difference Between the Input Limit and the Context Length Limit?

#36
by averyyu99

Dear community members, I found that the maximum token limit for a prompt is 8,196 tokens. What happens if I provide a prompt longer than this limit? Will the prompt be automatically truncated, with only the first 8,196 tokens being processed? I tested this and didn't encounter any errors, so I'm wondering how the model handles prompts that exceed the limit.

Also, I'm curious about the difference between the input limit and the context length limit. Since LLaMA 3 has a context length of 128k tokens, does that mean we can use iterative prompting strategies to process longer texts effectively? If so, how does the model handle prompts that exceed the input limit within a single request?
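For reference, this is roughly the kind of chunked, iterative workflow I have in mind (a sketch only; the model ID, chunk size, and pipeline usage are placeholders I have not validated):

```python
from transformers import AutoTokenizer, pipeline

MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"  # placeholder; any chat model would do
CHUNK_TOKENS = 4000  # stay well under the per-request limit

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
generator = pipeline("text-generation", model=MODEL_ID, tokenizer=tokenizer)

def summarize_long_text(text: str) -> str:
    # Split the document into token-bounded chunks so each request fits the window.
    ids = tokenizer.encode(text)
    chunks = [ids[i:i + CHUNK_TOKENS] for i in range(0, len(ids), CHUNK_TOKENS)]

    partial_summaries = []
    for chunk in chunks:
        prompt = "Summarize the following text:\n\n" + tokenizer.decode(chunk)
        out = generator(prompt, max_new_tokens=256, return_full_text=False)
        partial_summaries.append(out[0]["generated_text"])

    # Second pass: merge the per-chunk summaries into one final answer.
    merge_prompt = "Combine these partial summaries into one:\n\n" + "\n".join(partial_summaries)
    return generator(merge_prompt, max_new_tokens=256, return_full_text=False)[0]["generated_text"]
```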

Any help or explanation is appreciated! Thanks : )

Please note that the context window length is the same as the maximum input prompt length, and for this model the context window is 130K tokens, as defined here. As such, the maximum token limit for a prompt is 130K, not 8,196.
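You can confirm this locally by reading the configured context window and counting your prompt's tokens. A minimal sketch, assuming the Hugging Face transformers library; the model ID below is illustrative, so substitute the repository being discussed:

```python
from transformers import AutoConfig, AutoTokenizer

MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"  # illustrative; use the model in question

config = AutoConfig.from_pretrained(MODEL_ID)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# max_position_embeddings is the context window the model was configured with.
print("context window:", config.max_position_embeddings)

prompt = "Your long prompt text here."
print("prompt tokens:", len(tokenizer.encode(prompt)))
```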

However, a very long prompt will consume more GPU memory, since the attention key/value cache grows with the number of input tokens.
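As a rough illustration of how that memory grows with prompt length, the KV cache scales linearly with the token count. The layer, head, and dtype numbers below are assumptions for an 8B-class model in fp16/bf16, not exact figures for this checkpoint:

```python
# Rough KV-cache size: 2 (K and V) * layers * kv_heads * head_dim * bytes per element * tokens
def kv_cache_bytes(num_tokens: int,
                   num_layers: int = 32,      # assumed for an 8B-class model
                   num_kv_heads: int = 8,     # grouped-query attention heads (assumption)
                   head_dim: int = 128,
                   bytes_per_elem: int = 2):  # fp16/bf16
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem * num_tokens

for n in (8_192, 32_768, 131_072):
    print(f"{n:>7} tokens -> {kv_cache_bytes(n) / 2**30:.1f} GiB of KV cache")
```

With these assumed values, an 8K-token prompt needs about 1 GiB of KV cache, while a 128K-token prompt needs about 16 GiB, on top of the model weights themselves.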

Best regards,

Shuyue
Dec. 18th, 2024
