Chinese Company DeepSeek Releases DeepSeek-Coder, an LLM for Code Generation

Written By: Muhammad Talal Khan Afridi
Last Updated On: February 9, 2024

Limited access to giant closed-source models such as ChatGPT and Codex has long been a barrier to research and development. DeepSeek, a Chinese company with ambitions toward AGI, has now lowered that barrier by releasing DeepSeek-Coder, an open-source family of large language models built for code generation and available to everyone.

It is a notable shift in a period of rapid technological change. Keep reading to learn more, and stay for the quick demo at the end of this article to see DeepSeek-Coder in action.

Training on 2 Trillion Tokens Makes DeepSeek-Coder Exceptional

The researchers threw away the closed-source rule book by releasing a range of open-source code models varying in size from 1.3B to 33B parameters, which increases their versatility and applicability.

What’s more impressive is that the DeepSeek-Coder models are trained from scratch on 2 trillion tokens covering 87 programming languages. Training at this scale and breadth helps reduce overfitting and improve accuracy, and it lets the models read and write code across a wide range of languages.

A large dataset gives broad coverage of programming languages, libraries, and frameworks, and the DeepSeek-Coder team has taken care to give the models a solid foundation, making them more useful to developers.

Fill in the Blanks Task with a 16K Window

This model breaks new ground by adding a fill-in-the-blank task during training and using a 16K window, i.e., it can take in roughly 16,000 tokens of context at once.

Together, these two features strengthen code completion and infilling: DeepSeek-Coder can fill in missing pieces of code using the code both before and after the gap, and it can draw on much more of the surrounding project while doing so.

As a result, it supports project-level code completion with state-of-the-art results among open-source code models, making DeepSeek-Coder a capable pair programmer.
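To make the fill-in-the-blank idea concrete, here is a minimal sketch of how such a prompt can be assembled: the code before the gap, a marker for the gap, and the code after it. The sentinel strings below are an assumption based on the model's public README and should be checked against the tokenizer you actually load. The helper names and the characters-per-token estimate are hypothetical and only illustrate the 16K-token budget.

```python
# Sentinel strings are an ASSUMPTION taken from the model's public README;
# confirm them against the tokenizer before relying on them.
FIM_BEGIN = "<｜fim▁begin｜>"
FIM_HOLE = "<｜fim▁hole｜>"
FIM_END = "<｜fim▁end｜>"

# Assumes "16K" means 16 * 1024 tokens.
MAX_WINDOW_TOKENS = 16 * 1024


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the gap in fill-in-the-blank sentinels."""
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"


def roughly_fits_window(prompt: str, chars_per_token: float = 4.0) -> bool:
    """Crude length check. chars_per_token is a guess, not a real tokenizer."""
    return len(prompt) / chars_per_token <= MAX_WINDOW_TOKENS


# The model would be asked to produce the missing function body.
prefix = "def add(a, b):\n    "
suffix = "\n\nprint(add(2, 3))\n"
prompt = build_fim_prompt(prefix, suffix)

print(prompt)
print(roughly_fits_window(prompt))
```

The model's completion is then spliced in between the prefix and the suffix. For a real length check, count tokens with the model's own tokenizer instead of estimating from characters.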

DeepSeek-Coder’s Approach to Understanding Human Language

How can generative AI succeed if it cannot understand human language? It would be impossible to interact with, right? Imagine a modern human teleported back in time and speaking a modern language, say English, to ancient Egyptians who wrote in hieroglyphs. Neither side could communicate, and the same is true here.

DeepSeek-Coder’s approach to understanding human language goes like this:

Incorporation of natural language:
DeepSeek-Coder’s training data is not only source code: 10% of it is code-related English drawn from sources such as GitHub Markdown files and Stack Exchange, which helps with tasks like bug fixing and library usage. A further 3% is Chinese text from high-quality articles, which improves the model’s proficiency in Chinese.
A Novel Approach Using Repository-Level Parsing:
Previous large language models for code were trained primarily at the file level. For DeepSeek-Coder, the researchers parsed the dependencies between the files of each repository and ordered the files so that each one appears after the files it relies on. This helps the model handle project-level code, where earlier methods struggled.
Additional training on 2 trillion tokens from DeepSeek-LLM 7B:
Starting from the general-purpose DeepSeek-LLM 7B model, the team ran additional pre-training on a further 2 trillion tokens of code, natural language, and mathematical data to improve natural language understanding.
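
The repository-level ordering described above can be sketched as a topological sort over a file dependency graph. This is only an illustration, not the researchers' actual pipeline: the file names and the dependency map are hypothetical, and Python's standard-library sorter stands in for whatever ordering algorithm DeepSeek used.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical repository: each file maps to the files it imports.
dependencies = {
    "utils.py": set(),
    "model.py": {"utils.py"},
    "train.py": {"model.py", "utils.py"},
}


def order_files(deps: dict[str, set[str]]) -> list[str]:
    """Return files so that each one appears after the files it depends on."""
    return list(TopologicalSorter(deps).static_order())


ordered = order_files(dependencies)
print(ordered)  # ['utils.py', 'model.py', 'train.py']
```

Concatenating the files in this order means that by the time the model reads train.py, the definitions it imports are already in context. Note that this sorter raises graphlib.CycleError on circular imports, so a real pipeline would need its own way of handling cycles.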

Superiority of DeepSeek-Coder Over Its Alternatives

As an open-source project, DeepSeek-Coder has plenty of room to grow in the future, just like Linux or Android.

But who has seen the future? OpenAI was once focused on openness too, and now it is not. So let’s talk about today: according to the results DeepSeek reports, DeepSeek-Coder currently outperforms competing open-source code models across various benchmarks.

What’s more impressive is that even a smaller DeepSeek-Coder model, Base 6.7B, is reported to match or beat CodeLlama-Base 34B, and after fine-tuning, this small model even surpasses GPT-3.5 Turbo on the HumanEval benchmark, significantly narrowing the performance gap between open-source and closed-source models. DeepSeek also reports results ahead of GPT-3.5 Turbo on LeetCode and various other benchmarks.
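
For readers unfamiliar with how HumanEval-style results are scored: these benchmarks are commonly reported as pass@k, the chance that at least one of k generated solutions passes the unit tests. The sketch below uses the unbiased estimator from the paper that introduced HumanEval. The sample counts are hypothetical, and this article does not state which k the quoted comparisons use (headline figures are usually pass@1).

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: n samples generated, c of them pass the tests."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the probability that a random
    # size-k subset of the samples contains no passing solution.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical numbers for a single problem: 10 samples, 3 of them pass.
print(round(pass_at_k(n=10, c=3, k=1), 3))
print(round(pass_at_k(n=10, c=3, k=5), 3))
```

A benchmark score is the average of this value over all problems, so a model's result depends on both the model and the sampling settings used to generate its solutions.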

Experiencing DeepSeek-Coder with a Quick Demo

To access DeepSeek-Coder, you can visit its Hugging Face page, its GitHub repository, or the official website set up for chatting.

I visited the official website for a quick demo and generated a ping pong game to run in Visual Studio Code.

Below is the code for the ping pong game generated by DeepSeek-Coder:

import sys

import pygame

# Initialize pygame
pygame.init()

# Constants for the game
SCREEN_WIDTH, SCREEN_HEIGHT = 640, 480
BG_COLOR = (0, 0, 0)  # Black background
PADDLE_WIDTH, PADDLE_HEIGHT = 15, 60
BALL_SIZE = 15
PADDLE_SPEED = 5
BALL_SPEED_X, BALL_SPEED_Y = 3, 3
FPS = 60

# Set up the display
screen = pygame.display.set_mode((SCREEN_WIDTH, SCREEN_HEIGHT))
pygame.display.set_caption("Ping Pong")

# Paddle and ball objects
paddle1 = pygame.Rect(10, (SCREEN_HEIGHT - PADDLE_HEIGHT) // 2, PADDLE_WIDTH, PADDLE_HEIGHT)
paddle2 = pygame.Rect(SCREEN_WIDTH - 10 - PADDLE_WIDTH, (SCREEN_HEIGHT - PADDLE_HEIGHT) // 2, PADDLE_WIDTH, PADDLE_HEIGHT)
ball = pygame.Rect(SCREEN_WIDTH // 2 - BALL_SIZE // 2, SCREEN_HEIGHT // 2 - BALL_SIZE // 2, BALL_SIZE, BALL_SIZE)


# Game loop
def main(ball_speed_x, ball_speed_y):
    clock = pygame.time.Clock()
    running = True

    while running:
        # Handle events
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                running = False

        # Move the paddles (W/S for the left paddle, arrow keys for the right)
        keys = pygame.key.get_pressed()
        if keys[pygame.K_w] and paddle1.top > 0:
            paddle1.move_ip(0, -PADDLE_SPEED)
        if keys[pygame.K_s] and paddle1.bottom < SCREEN_HEIGHT:
            paddle1.move_ip(0, PADDLE_SPEED)
        if keys[pygame.K_UP] and paddle2.top > 0:
            paddle2.move_ip(0, -PADDLE_SPEED)
        if keys[pygame.K_DOWN] and paddle2.bottom < SCREEN_HEIGHT:
            paddle2.move_ip(0, PADDLE_SPEED)

        # Move the ball
        ball.move_ip(ball_speed_x, ball_speed_y)

        # Check for collision with top and bottom
        if ball.top <= 0 or ball.bottom >= SCREEN_HEIGHT:
            ball_speed_y *= -1

        # Check for collision with paddles
        if ball.colliderect(paddle1) or ball.colliderect(paddle2):
            ball_speed_x *= -1

        # Check for collision with sides: reset the ball to the center
        if ball.left <= 0 or ball.right >= SCREEN_WIDTH:
            ball.x = SCREEN_WIDTH // 2 - BALL_SIZE // 2
            ball.y = SCREEN_HEIGHT // 2 - BALL_SIZE // 2

        # Draw everything
        screen.fill(BG_COLOR)
        pygame.draw.rect(screen, (255, 255, 255), paddle1)
        pygame.draw.rect(screen, (255, 255, 255), paddle2)
        pygame.draw.ellipse(screen, (255, 255, 255), ball)
        pygame.draw.aaline(screen, (255, 255, 255), (SCREEN_WIDTH // 2, 0), (SCREEN_WIDTH // 2, SCREEN_HEIGHT))

        # Update the display
        pygame.display.flip()

        # Cap the frame rate
        clock.tick(FPS)

    # Shut down cleanly once the window is closed
    pygame.quit()
    sys.exit()


if __name__ == "__main__":
    main(BALL_SPEED_X, BALL_SPEED_Y)

Here is a video recording of the ping pong game generated entirely with DeepSeek-Coder.

With DeepSeek-Coder, the doors to AGI have opened a little wider. By making DeepSeek-Coder open source, the Chinese company has opened the door to global collaboration and innovation, making education, training, and research and development easier, and posing a real challenge to closed-source platforms.
