Enterprise developers and company leaders understand the significance of APIs: the connective tissue of modern software development, letting third-party apps integrate with and build atop tech platforms. Recently, OpenAI made substantial enhancements to the API for its GPT-4 Turbo large language model.
Today, OpenAI announced on its X account that its GPT-4 Turbo with Vision model is now generally available through its API. GPT-4's vision capabilities were first unveiled, alongside audio uploads, in September 2023; then at OpenAI's developer conference in November, the company introduced GPT-4 Turbo, promising faster responses, a larger input context window of up to 128,000 tokens (roughly the length of a 300-page book), and lower prices.
Requests that use OpenAI’s vision recognition and analysis capabilities can now also be made with JSON mode or function calling: the model returns a JSON snippet that developers can use in their connected apps to automate tasks such as sending an email, posting something online, or making a purchase. OpenAI strongly advises building user confirmation flows before taking actions that affect the world on a user’s behalf.
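As a minimal sketch of what such a request might look like, the code below builds a Chat Completions request body that pairs an image with a text instruction and exposes one callable function, then gates execution behind the user confirmation OpenAI recommends. The `send_email` tool, its parameters, and the helper names are hypothetical illustrations; the `gpt-4-turbo` model alias and request shape are those documented at the time of writing and may change.

```python
import json

# Hypothetical tool definition: the model may ask to call send_email.
# The name and parameters are illustrative, not part of OpenAI's API.
SEND_EMAIL_TOOL = {
    "type": "function",
    "function": {
        "name": "send_email",
        "description": "Send an email on the user's behalf.",
        "parameters": {
            "type": "object",
            "properties": {
                "to": {"type": "string"},
                "subject": {"type": "string"},
                "body": {"type": "string"},
            },
            "required": ["to", "subject", "body"],
        },
    },
}

def build_vision_tool_request(image_url: str, instruction: str) -> dict:
    """Build a request body that sends an image and a text instruction
    together, with one function the model is allowed to call."""
    return {
        "model": "gpt-4-turbo",  # GA alias for GPT-4 Turbo with Vision
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": instruction},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "tools": [SEND_EMAIL_TOOL],
    }

def confirm_before_acting(tool_call: dict) -> bool:
    """Per OpenAI's advice, require explicit user confirmation before
    executing any model-requested action that affects the world."""
    args = json.loads(tool_call["function"]["arguments"])
    answer = input(f"Send email to {args['to']}? [y/N] ")
    return answer.strip().lower() == "y"
```

The request body would then be passed to the Chat Completions endpoint; if the response contains a tool call, `confirm_before_acting` runs before any email is actually sent.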
OpenAI spokespersons indicate that these changes help developers streamline their workflows and build more efficient apps: previously, developers had to call separate models for text and images, but now a single API call lets the model analyze an image and apply reasoning to it.
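A sketch of that single-call workflow, here combined with JSON mode so the image analysis comes back as structured data in one round trip. This assumes the official `openai` Python package (v1+) and an `OPENAI_API_KEY` in the environment; the live call is gated so the sketch runs without credentials, and the prompt mentions JSON because the API requires that when JSON mode is enabled.

```python
import os

def structured_image_request(image_url: str) -> dict:
    """One request both reads the image and asks for structured output:
    what previously took separate text and image models now travels in
    a single call."""
    return {
        "model": "gpt-4-turbo",
        # JSON mode: the API returns a valid JSON object; the prompt
        # must mention JSON for this to be accepted.
        "response_format": {"type": "json_object"},
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": 'Describe this image as JSON with keys '
                                '"objects" and "scene".',
                    },
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

if os.environ.get("OPENAI_API_KEY"):
    # Requires the official `openai` package; its interface may differ
    # across versions, so treat this call as an assumption.
    from openai import OpenAI
    resp = OpenAI().chat.completions.create(
        **structured_image_request("https://example.com/photo.jpg")
    )
    print(resp.choices[0].message.content)  # a JSON string
```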
OpenAI highlights several customers already taking advantage of GPT-4 Turbo with Vision, including startup Cognition, whose autonomous AI coding agent Devin uses it to generate full code on the user’s behalf…