AI Tech Circle

Large Language Models

Published 6 months ago • 4 min read

Open Tech Talks newsletter!

Generative AI has brought artificial intelligence to the masses, and everyone has started talking about it, experimenting with it, and using it in day-to-day life. In the enterprise space, earlier this month we covered the most critical part of implementing large language models in an organization: choosing use cases, since everyone is trying to work out which ones can be targeted for implementation. I covered a few use cases in the retail industry, along with a few examples that have already been implemented.

This week, I thought of going over a few of the open-source and commercially available large language models on the market.

Large Language Models

  1. Llama 2: trained on 2 trillion tokens, with double the context length of Llama 1. Llama 2 was pre-trained on publicly available online data sources, and its fine-tuned Llama Chat models have been trained on over 1 million human annotations. Code Llama is a code generation model built on Llama 2 and trained on 500B tokens of code. It supports common programming languages, including Python, C++, Java, PHP, TypeScript (JavaScript), C#, and Bash.
  2. Dolly: The first Dolly was trained for less than $30 to display ChatGPT-like human interactivity. Dolly 2.0 is a 12B parameter language model based on the EleutherAI Pythia model family and fine-tuned exclusively on a new, high-quality, human-generated instruction-following dataset crowdsourced from Databricks employees. According to Databricks, this dataset, databricks-dolly-15k, is the first open-source, human-generated instruction dataset specifically designed to make large language models exhibit the interactivity of ChatGPT. More than 5,000 Databricks employees authored it during March and April of 2023. The training records are natural, expressive, and designed to represent various behaviors, from brainstorming and content generation to information extraction and summarization.
  3. Guanaco: an advanced instruction-following language model based on Meta's LLaMA 7B model. Building on the initial 52K-entry dataset from the Alpaca model, it incorporates an additional 534,530 entries. These cover English, Simplified Chinese, Traditional Chinese (Taiwan), Traditional Chinese (Hong Kong), Japanese, and German, as well as various linguistic and grammatical tasks.
  4. BLOOM: billed as the world's largest open multilingual language model. With 176 billion parameters, BLOOM can generate text in 46 natural languages and 13 programming languages.
  5. LLaVA (Large Language and Vision Assistant): a model that blends vision and language understanding. It behaves like a visual counterpart to GPT-4 and reports high accuracy on the Science QA benchmark.
  6. Vicuna-13B is an open-source chatbot based on LLaMA and fine-tuned with ShareGPT conversations. Vicuna LLM creates text that feels natural and is both engaging and informative.
  7. FLAN-T5: a model created by Google Research. It is trained on a variety of tasks, both supervised and unsupervised, all converted into a text-to-text format. It is an instruction-fine-tuned version of Google's T5 model.
  8. Falcon: created by Abu Dhabi's Technology Innovation Institute (TII), it comes in two models, Falcon-40B and Falcon-7B. Their training data is web data that has been deduplicated and passed through a custom filtering system. Multi-query attention helps these models run faster. Falcon can write human-like text, translate languages, and respond to questions.

I had a chance to visit the Falcon showcase during the GITEX exhibition.
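
The Falcon entry above mentions removing duplicates and filtering web data before training. As a rough illustration of that idea only, here is a minimal Python sketch that drops exact duplicates (after simple normalization) and applies a length-based filter. It is not TII's actual pipeline: the function names, the normalization rule, and the `min_words` threshold are all hypothetical choices made for this example.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", text.strip().lower())


def passes_quality_filter(text: str, min_words: int = 5) -> bool:
    """Hypothetical heuristic: keep documents with at least `min_words` words."""
    return len(text.split()) >= min_words


def dedupe_and_filter(documents: list[str]) -> list[str]:
    """Return documents that pass the filter, keeping the first copy of each."""
    seen: set[str] = set()
    kept: list[str] = []
    for doc in documents:
        if not passes_quality_filter(doc):
            continue  # too short to be useful under this toy heuristic
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # an equivalent document was already kept
        seen.add(digest)
        kept.append(doc)
    return kept


if __name__ == "__main__":
    pages = [
        "Falcon is a large language model from TII.",
        "falcon is a  large language model from TII.",
        "Click here",
        "BLOOM can generate text in 46 natural languages.",
    ]
    for page in dedupe_and_filter(pages):
        print(page)
```

Real web-scale pipelines typically add fuzzy (near-duplicate) matching and much richer quality rules; the exact-hash approach here is only the simplest version of the idea.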

Proprietary/commercial large language foundation models:

  • AI21:
    • J2 Ultra Instruct
    • J2 Mid Instruct
    • AI21 Summarize
  • Anthropic:
    • Claude
  • Cohere:
    • Generate Model Command
    • Generate Model Command-Light
  • LightOn:
    • Lyra-Fr 10B
  • Stability AI:
    • SDXL (a text-to-image model rather than a language model)
  • Amazon:
    • Titan Text Large

News & Updates...

This week has seen a storm of new AI features and products announced, fueling the technology revolution.

  1. A blog post on Multi-GPU multinode fine-tuning Llama2 on OCI Data Science
  2. Revolutionizing AI-driven research with Cleveland Clinic and OCI
  3. Gen AI Navigator from Google Cloud is a guide to help you adopt Gen AI.
  4. Frontier risk and preparedness is an initiative from OpenAI.
  5. Prompts are key in 2023: Twenty-five tips to help you unlock the potential of generative AI.

Things to Know

  • Anthropic, Google, Microsoft, and OpenAI have announced the Frontier Model Forum and the creation of a new AI Safety Fund, an initiative of more than $10 million to promote research in the field of AI safety.
  • Policy paper: "Emerging Processes for Frontier AI Safety" by the UK Government.
  • NEOM and Pony.ai established a joint venture to develop, manufacture, and deliver autonomous vehicles, an autonomous driving service, and smart vehicle infrastructure.

Until next week,

Kashif Manzoor

The opinions expressed here represent solely my own personal conjecture based upon experience, practice, and observation and do not represent the thoughts, intentions, plans, or strategies of my current or previous employers or their clients/customers. The objective of this newsletter is to share and learn with the community.