ML.VisionToText

Name	Mandatory	Description	Default	Type
`⬅️ Input`		The input of the shard, if any		`Image`
`Output ➡️`		The resulting output of the shard		`String`
`Model`	No	The Moondream2 model to use.	`none`	`Var(Model)`
`Tokenizer`	No	The tokenizer to use.	`none`	`Var(Tokenizer)`
`Prompt`	No	The prompt to use for the vision-to-text generation.	`none`	`String`
`Temperature`	No	Temperature for text generation (0.0 for deterministic output).	`0.5`	`Float`
`TopP`	No	Top-p sampling value (0.0-1.0, 0.0 to disable).	`0.9`	`Float`
`RepeatPenalty`	No	Penalty for repeating tokens (1.0 means no penalty).	`1`	`Float`
`MaxTokens`	No	Maximum number of tokens to generate.	`512`	`Int`
`GPU`	No	Whether to use the GPU (if available).	`false`	`Bool`
`Seed`	No	The seed to use for the generation.	`42`	`Int` `Var(Int)`

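A complete vision-to-text pipeline built on the Moondream2 model. It takes an image as input and outputs the text generated in response to `Prompt`. Generation stops after at most `MaxTokens` tokens, and `Temperature`, `TopP`, `RepeatPenalty`, and `Seed` control how each token is sampled.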
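
To make the sampling parameters more concrete, the following Python sketch shows one common way that temperature, top-p, and a repeat penalty are combined when picking the next token. It is an illustration only, not the shard's actual implementation: the function name `sample_next_token`, the toy logits, and the exact penalty rule (divide positive logits, multiply negative ones) are assumptions chosen for the example.

```python
import math
import random


def sample_next_token(logits, previous_tokens, temperature=0.5, top_p=0.9,
                      repeat_penalty=1.0, rng=None):
    """Pick one token id from raw logits (hypothetical helper, illustration only)."""
    rng = rng or random
    logits = list(logits)

    # Repeat penalty: make already-generated tokens less likely (1.0 = no change).
    # Assumed convention: divide positive logits, multiply negative ones.
    for tok in set(previous_tokens):
        if logits[tok] >= 0:
            logits[tok] /= repeat_penalty
        else:
            logits[tok] *= repeat_penalty

    # Temperature 0.0: deterministic, always take the highest-scoring token.
    if temperature <= 0.0:
        return max(range(len(logits)), key=lambda i: logits[i])

    # Softmax over temperature-scaled logits (shifted by the max for stability).
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-p: keep the smallest set of most likely tokens whose mass reaches top_p.
    # A value of 0.0 (or 1.0) leaves the distribution untouched.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if 0.0 < top_p < 1.0:
        kept, mass = [], 0.0
        for i in order:
            kept.append(i)
            mass += probs[i]
            if mass >= top_p:
                break
        order = kept

    # Sample from the kept tokens in proportion to their probability.
    return rng.choices(order, weights=[probs[i] for i in order], k=1)[0]


toy_logits = [2.0, 1.0, 0.5, -1.0]

# Temperature 0.0 picks the largest logit (token 0).
print(sample_next_token(toy_logits, [], temperature=0.0))

# Penalizing token 0 (2.0 / 4.0 = 0.5) makes token 1 the largest logit.
print(sample_next_token(toy_logits, [0], temperature=0.0, repeat_penalty=4.0))

# A fixed seed makes sampled output reproducible, which is the role `Seed` plays.
print(sample_next_token(toy_logits, [], rng=random.Random(42)))
```

In this sketch a lower temperature sharpens the distribution toward the most likely token, while a lower top-p value narrows the set of candidates before sampling. Passing a seeded `random.Random` plays the part of the `Seed` parameter: the same seed and inputs give the same choice.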