The world of AI is abuzz with the recent introduction of Llama Datasets, a new initiative aimed at facilitating the benchmarking of RAG (Retrieval-Augmented Generation) pipelines for various use cases within the AI community. The initiative comprises community-contributed datasets that include question-answer pairs and source context, allowing users to evaluate their RAG pipelines effectively. LlamaIndex has launched with an initial set of 10 evaluation datasets, providing various scenarios for developers to assess their systems.
Evaluating RAG systems poses a challenge due to the stochastic nature of language models and machine learning systems in general. Unlike traditional software systems with deterministic behavior, LLM (large language model) systems are designed as stochastic black boxes, making it challenging to define unit tests. Developers must establish an evaluation dataset of their production use cases and assess their systems using appropriate metrics.
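To make the idea of an evaluation dataset plus a metric concrete, here is a minimal sketch in plain Python. It does not use the Llama Datasets API: the `Example` class, the `token_f1` metric, the `evaluate` helper, and the toy pipeline are all hypothetical names chosen for illustration, and token-overlap F1 is just one simple metric a developer might pick.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    """One evaluation record (hypothetical shape): question, reference answer, source context."""
    query: str
    reference_answer: str
    reference_contexts: List[str]


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted answer and a reference answer."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evaluate(pipeline: Callable[[str], str], dataset: List[Example]) -> float:
    """Run the pipeline on every query and return the mean token F1."""
    scores = [token_f1(pipeline(ex.query), ex.reference_answer) for ex in dataset]
    return sum(scores) / len(scores) if scores else 0.0


# A tiny hand-written dataset standing in for real production queries.
dataset = [
    Example("What is the capital of France?", "Paris",
            ["Paris is the capital of France."]),
    Example("What is the capital of Italy?", "Rome",
            ["Rome is the capital of Italy."]),
]


def toy_pipeline(query: str) -> str:
    """Stand-in for a RAG pipeline: it only knows one answer."""
    return "Paris" if "France" in query else "unknown"


mean_f1 = evaluate(toy_pipeline, dataset)
print(f"mean token F1: {mean_f1:.2f}")
```

Because a real LLM-backed pipeline is stochastic, a single run of such a harness gives only one sample; averaging the score over several runs is a common way to get a steadier number.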
Llama Datasets addresses the complexity of defining the proper evaluation dataset by providing a hub where users can easily access and select datasets based on their specific use cases. Instead of prescribing particular data, LlamaIndex offers a flexible platform for users to choose datasets relevant to their applications.
The launch includes the Llama Datasets available on LlamaHub and the RagEvaluatorPack to compute metrics over a dataset. Additionally, users can work with dataset abstractions independently. To use a Llama Dataset, users can download it from LlamaHub and utilize the RagEvaluatorPack or their own evaluation modules. To contribute, developers can submit a "data card" to LlamaHub and upload raw dataset files to the llama_datasets repository.
The initial ten datasets cover diverse domains, including Blockchain Solana, Coda Help Desk, FinanceBench, Paul Graham Essays, Llama 2 Paper, Uber/Lyft 2021 10K Filings, Mini Truthful QA, Mini Squad V2, Origin of COVID-19, and LLM Survey Paper. LlamaIndex plans to add more datasets to enhance the range of available scenarios for evaluation.
Meta has introduced Audiobox, the latest advancement in generative AI for audio. As the successor to Voicebox, Audiobox unifies the capabilities of generating and editing audio, encompassing speech, sound effects, and soundscapes. It stands out by enabling users to create voices and sound effects through a combination of voice inputs and natural language text prompts, simplifying the creation of custom audio for diverse applications.
Audiobox allows users to provide natural language prompts to describe the desired sound or type of speech they want to generate. For example, users can input prompts like "A running river and birds chirping" to generate soundscapes or "A young woman speaks with a high pitch and fast pace" for voice generation. This flexibility makes Audiobox a versatile tool for content creators and developers, offering state-of-the-art controllability in speech and sound effects generation.
One of Audiobox's unique features is its ability to support dual input: users can combine an audio voice input with a text-style prompt to synthesize speech or emotions in different environments. This dual input approach, supporting both voice and text prompts, is a first in freeform voice restyling.
Audiobox is built on the Voicebox framework but extends its capabilities to generate more sounds, including speech in various environments and styles, non-speech sound effects, and soundscapes. It emphasizes the importance of democratizing audio generation, allowing professionals and hobbyists to easily create tailored audio for videos, podcasts, games, and more.
To address concerns related to voice impersonation and misuse, Meta has implemented automatic audio watermarking in Audiobox, allowing audio to be accurately traced to its origin. Additionally, the interactive demo of Audiobox includes a voice authentication feature to prevent impersonation attempts.
Meta encourages responsible collaboration with the research community and is releasing Audiobox under a research-only license to a limited number of selected researchers and institutions. The company emphasizes the importance of responsible AI use and aims to address potential issues with voice impersonation through innovative technologies.
The release of Audiobox underlines Meta's commitment to advancing responsible AI research and fostering collaboration within the research community.
IBM and Meta have jointly launched the AI Alliance, an international community comprising over 50 founding members and collaborators, including notable organizations like AMD, CERN, Dell Technologies, Intel, NASA, Oracle, Red Hat, Sony Group, and numerous universities. The AI Alliance fosters collaboration among technology developers, researchers, and adopters to advance open, safe, and responsible AI.
The initiative is driven by the recognition that AI advancements offer significant opportunities to enhance various aspects of life, and open and transparent innovation is crucial to empowering a diverse range of AI stakeholders. While individual entities are already committed to open science and technologies in AI, the AI Alliance seeks to enhance collaboration and information sharing to accelerate innovation more inclusively.
The AI Alliance is designed to be action-oriented and international, focusing on creating opportunities across diverse institutions to shape the evolution of AI that reflects the complexities of societies. The primary objectives include fostering an open community, enabling developers and researchers to accelerate responsible innovation, and ensuring scientific rigor, trust, safety, security, diversity, and economic competitiveness.
Several key projects are planned under the AI Alliance, including developing and deploying benchmarks, evaluation standards, tools, and resources for responsible AI system development. The alliance will support open foundation models with diverse modalities, encourage AI hardware accelerator ecosystem development, facilitate global AI skills building and exploratory research, and develop educational content for public discourse and policymaker awareness.
To execute these objectives, the AI Alliance will establish member-driven working groups across various areas and form a governing board and technical oversight committee. The alliance plans to collaborate with existing initiatives from governments, non-profits, and civil society organizations.
The founding members and collaborators of the AI Alliance, including IBM and Meta, emphasize the importance of responsible AI use and believe that the alliance will contribute to the responsible development and deployment of AI technologies. The collaborative effort aims to address safety concerns, share knowledge, and develop solutions that meet the diverse needs of AI researchers, developers, and adopters globally.