K210 introduction

RISC-V (pronounced “risk-five”) is an open-source hardware instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles.
In contrast to most ISAs, the RISC-V ISA is free and open-source and can be used royalty-free for any purpose, permitting anyone to design, manufacture and sell RISC-V chips and software. While not the first open-architecture ISA, it is significant because it is designed to be useful in a wide range of devices. The instruction set also has a substantial body of supporting software, which avoids a usual weakness of new instruction sets.
The project began in 2010 at the University of California, Berkeley, but many contributors are volunteers and industry workers outside the university.
The RISC-V ISA has been designed with small, fast, and low-power real-world implementations in mind, but without over-architecting for a particular microarchitecture style.
As of May 2017, version 2.2 of the userspace ISA is fixed and the privileged ISA is available as draft version 1.10.

The Kendryte K210 is a dual-core 64-bit RISC-V processor.

Here are some features:
First competitive RISC-V chip and also first competitive AI chip, released in September 2018.
Built on an advanced 28 nm process, dual-core 64-bit RISC-V (IMAFDC), temperature range -40°C to 125°C
Low voltage, reduced power consumption compared to other systems with the same processing power
3.3V/1.8V dual voltage IO support eliminates need for level shifters
Large 8 MB on-chip high-speed SRAM, 400 MHz frequency (can be overclocked up to 800 MHz).
Built-in KPU (Neural Network Processor): 64 KPUs, 576 bits wide, with support for convolution kernels. It offers 0.25 TOPS at 0.3 W and 400 MHz, and 0.5 TOPS when overclocked to 800 MHz, which is enough for object recognition at 60 fps at VGA resolution.
Supports fixed-point models trained with mainstream training frameworks, subject to specific restriction rules
There is no direct limit on the number of network layers, and the parameters of each convolutional layer can be configured separately, including the number of input and output channels and the input and output line width and column height
Support for 1x1 and 3x3 convolution kernels
Support for any form of activation function
The maximum supported neural network parameter size for real-time work is 5MiB to 5.9MiB
The maximum supported network parameter size when working in non-real time is (flash size - software size)
APU (Audio Processor): supports 8 microphones, sample rates up to 192 kHz, and a hardware FFT unit
Up to 8 channels of audio input data, i.e. 4 stereo channels
Simultaneous scanning pre-processing and beamforming for sound sources in up to 16 directions
Supports one active voice stream output
16-bit wide internal audio signal processing
Support for 12-bit, 16-bit, 24-bit, and 32-bit input data widths
Multi-channel direct raw signal output
Up to 192kHz sample rate
Built-in FFT unit supports 512-point FFT of audio data
Uses system DMAC to store output data in system memory
Flexible FPIOA (Programmable IO Array) maximises design flexibility by allowing users to map 255 internal functions to 48 free IOs on the chip
Programmable IO function selection
8 drive strength options for outputs
Selectable internal pull-up resistors
Selectable internal pull-down resistors
Schmitt trigger option for inputs
Slew rate control for outputs
Selectable internal input level
DVP camera and MCU LCD interfaces: you can connect a DVP camera, run your algorithm, and display the result on an LCD.
The DVP is a camera interface module with the following features:
Supports cameras with a DVP interface
Supports camera configuration using SCCB protocol
Maximum frame size 640x480
Supports YUV422 and RGB565 format image input
Can output images to both KPU and display
Output format to KPU: RGB888 or the Y component of YUV422 input
Output format to display: RGB565
Can send an interrupt to the CPU at the start of a frame or on completion of frame image transmission
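
The frame size and pixel formats listed for the DVP translate directly into buffer sizes. The following is a minimal sketch of that arithmetic, assuming pixels are stored without row padding (the memory layout is not described here, so this is an assumption). The name `frame_bytes` is illustrative and not part of any Kendryte SDK.

```python
# Bytes per pixel for the formats named in the DVP feature list.
# Assumption: no padding between rows or pixels.
BYTES_PER_PIXEL = {
    "RGB565": 2,  # 5 + 6 + 5 bits per pixel
    "YUV422": 2,  # two pixels share one U and one V sample: 4 bytes per 2 pixels
    "RGB888": 3,  # 8 bits per colour component
    "Y": 1,       # luma component only
}

MAX_WIDTH, MAX_HEIGHT = 640, 480  # maximum DVP frame size


def frame_bytes(fmt, width=MAX_WIDTH, height=MAX_HEIGHT):
    """Return the size in bytes of one frame in the given format."""
    if not (0 < width <= MAX_WIDTH and 0 < height <= MAX_HEIGHT):
        raise ValueError("frame exceeds the 640x480 maximum")
    return width * height * BYTES_PER_PIXEL[fmt]


if __name__ == "__main__":
    for fmt in BYTES_PER_PIXEL:
        print(fmt, frame_bytes(fmt))
```

At the maximum frame size, an RGB565 display frame takes 614,400 bytes and an RGB888 frame for the KPU takes 921,600 bytes, so one of each uses roughly 1.5 MB of the on-chip SRAM.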

Many other hardware accelerators and peripherals: AES Accelerator, SHA256 Accelerator, FFT Accelerator (separate from the one in the APU), OTP, UART, WDT, IIC, SPI, I2S, TIMER, RTC, PWM, etc.
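
The KPU figures in the feature list (0.25 TOPS at 400 MHz, 0.5 TOPS at 800 MHz, 60 fps at VGA, 5 MiB to 5.9 MiB of parameters for real-time work) imply a compute and memory budget per frame. A rough sketch of that arithmetic follows. It assumes throughput scales linearly with clock frequency, which matches the two quoted data points but is otherwise an assumption, and all helper names are illustrative.

```python
BASE_CLOCK_HZ = 400_000_000  # nominal clock from the feature list
BASE_TOPS = 0.25             # quoted throughput at the nominal clock


def kpu_tops(clock_hz):
    """Estimated peak throughput in TOPS (assumes linear scaling with clock)."""
    return BASE_TOPS * clock_hz / BASE_CLOCK_HZ


def ops_per_frame(clock_hz, fps):
    """Peak number of operations available for each frame at a given frame rate."""
    return kpu_tops(clock_hz) * 1e12 / fps


def fits_realtime(param_bytes, limit_mib=5.0):
    """Check a model against the real-time parameter limit.

    The default uses the conservative 5 MiB end of the quoted
    5 MiB to 5.9 MiB range.
    """
    return param_bytes <= limit_mib * 1024 * 1024


if __name__ == "__main__":
    budget = ops_per_frame(800_000_000, 60)
    print("ops per frame at 800 MHz, 60 fps:", round(budget))
    print("ops per VGA pixel:", round(budget / (640 * 480)))
    print("4 MiB model fits:", fits_realtime(4 * 1024 * 1024))
```

These are peak figures; the throughput a real network achieves depends on layer shapes and memory traffic and will generally be lower.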
Machine Vision
Object Detection
Image Classification
Face Detection and Recognition
Obtaining size and coordinates of target in real time
Obtaining type of detected target in real time
Machine Hearing
Sound source orientation detection
Sound Field Imaging
Voice Wake-Up
Speech Recognition
Hybrid Audio/Vision Solution
The Kendryte K210 combines machine vision and machine hearing to provide even more powerful features.
In applications, both sound source localization and sound field imaging can assist machine vision in tracking a target, while general target detection can obtain the target’s orientation and then assist beamforming toward that source.
Additionally, a person's direction can be obtained from the camera image, so that the microphone array can be directed toward that person by beamforming. Conversely, the direction of speech can be determined from the microphone array, and the camera can then be rotated to point at the speaker.
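
The camera-to-microphone hand-off described above can be sketched as follows: convert the horizontal pixel position of a detected person into a bearing, then pick the closest beamforming direction. This is only an illustration under stated assumptions: a pinhole camera model, a hypothetical 60 degree horizontal field of view, a camera axis aligned with direction 0 of the microphone array, and 16 directions spaced evenly over 360 degrees (the feature list gives the count, not the spacing). The function names are not from any SDK.

```python
import math


def pixel_to_bearing_deg(x, image_width=640, hfov_deg=60.0):
    """Bearing of image column x relative to the optical axis (positive = right).

    hfov_deg is an assumed horizontal field of view, not a K210 property.
    """
    cx = image_width / 2.0
    # Focal length in pixels for a pinhole camera with the given field of view.
    focal_px = cx / math.tan(math.radians(hfov_deg) / 2.0)
    return math.degrees(math.atan((x - cx) / focal_px))


def nearest_direction(bearing_deg, n_directions=16):
    """Index of the evenly spaced direction closest to the bearing."""
    step = 360.0 / n_directions
    return round((bearing_deg % 360.0) / step) % n_directions


if __name__ == "__main__":
    for x in (0, 320, 639):
        bearing = pixel_to_bearing_deg(x)
        print(x, round(bearing, 1), nearest_direction(bearing))
```

The reverse path (microphone array to camera) would use the same geometry: the direction index gives a bearing, which becomes the pan angle for the camera.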

Architecture Overview:

The K210 includes two 64-bit RISC-V CPU cores, each with a built-in independent FPU.
The primary functions of the K210 are machine vision and machine hearing; it includes a KPU for computing convolutional neural networks and an APU for processing microphone array inputs.
The K210 features a Fast Fourier Transform (FFT) Accelerator for high performance complex FFT calculations. As a result, for most machine learning algorithms, the K210 has high-performance processing power.
The K210 embeds AES and SHA256 algorithm accelerators to provide users with basic security features.
The K210 features high-performance, low-power SRAM and powerful DMA for superior data throughput.
The K210 has a wide range of peripheral units (DVP, JTAG, OTP, FPIOA, GPIO, UART, SPI, RTC, I2S, I2C, WDT, Timer and PWM) to cover a large number of application scenarios.
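
Among these peripherals, the FPIOA is what routes internal functions to physical pins. A minimal sketch of such a mapping table is shown below, using the 255 functions and 48 IOs given in the feature list. The class, its methods, and the function numbers in the usage lines are hypothetical and do not mirror the Kendryte SDK. The rule that a function may be routed to only one pin at a time is a simplification made for this sketch, not a statement about the hardware.

```python
NUM_IO_PINS = 48     # free IOs on the chip
NUM_FUNCTIONS = 255  # internal functions that can be mapped


class PinMap:
    """Illustrative pin-to-function table (not the real FPIOA driver)."""

    def __init__(self):
        self._pin_to_func = {}

    def assign(self, pin, func):
        """Route internal function `func` to IO pin `pin`."""
        if not 0 <= pin < NUM_IO_PINS:
            raise ValueError("pin out of range")
        if not 0 <= func < NUM_FUNCTIONS:
            raise ValueError("function out of range")
        # Sketch-only policy: reject a function already routed to another pin.
        for other_pin, other_func in self._pin_to_func.items():
            if other_func == func and other_pin != pin:
                raise ValueError("function already routed to pin %d" % other_pin)
        self._pin_to_func[pin] = func

    def function_on(self, pin):
        """Return the function routed to `pin`, or None if unassigned."""
        return self._pin_to_func.get(pin)


if __name__ == "__main__":
    pins = PinMap()
    pins.assign(4, 18)  # hypothetical function number
    pins.assign(5, 19)  # hypothetical function number
    print(pins.function_on(4), pins.function_on(6))
```

Keeping the table in one place makes it easy to move a peripheral to a different pin without touching the code that uses it, which is the design flexibility the FPIOA is meant to provide.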