2.9 STEP: An 8K-60fps Space-Time Resolution-Enhancement Neural-Network Processor for Next-Generation Display and Streaming
| Published in | Digest of technical papers - IEEE International Solid-State Circuits Conference Vol. 68; pp. 1-3 |
|---|---|
| Main Authors | , , , , , , , | 
| Format | Conference Proceeding | 
| Language | English | 
| Published | IEEE, 16.02.2025 |
| ISSN | 2376-8606 | 
| DOI | 10.1109/ISSCC49661.2025.10904700 | 
| Summary | Next-generation display technology is driving ultra-high-definition (UHD) TVs and screens, offering users an immersive experience. However, the scarcity of 8K-UHD streams and the high cost of transmission bandwidth necessitate the use of ISP techniques on terminal displays to enhance video quality. Deep-learning algorithms, in particular, can be employed to render stable and vivid videos. The one-stage space-time video super-resolution (STVSR) algorithm [1], depicted in Fig. 2.9.1, simultaneously generates high-resolution, high-frame-rate video from low-resolution, low-frame-rate input. However, rendering 8K-UHD 60fps video on edge devices with limited computational resources still poses three main challenges. First, although deeper models typically yield better video quality, resource constraints necessitate shallower CNN models, which compromises image quality. Second, the deformable convolution with modulation (DCM) [2] effectively aligns images across different time points, and multiple DCM layers further improve image quality. However, these layers require additional on-chip memory to store feature maps (FM) within the layer-fusion (LF) workflow. Third, high-throughput computation requires large PE arrays (e.g., 10K MACs), which significantly increases power consumption. |
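
To give a sense of scale for the throughput challenge described in the summary, the following is a rough back-of-the-envelope sketch, not a figure from the paper. It assumes the standard 8K-UHD frame size of 7680×4320 pixels and takes the 10K-MAC example from the summary; the 500 MHz clock and the function names are hypothetical values chosen only for illustration.

```python
# Rough pixel-rate budget for 8K-UHD output at 60 fps.
# ASSUMPTION: 8K-UHD is taken as 7680 x 4320 pixels (the usual definition).
WIDTH, HEIGHT, FPS = 7680, 4320, 60


def pixel_rate(width: int, height: int, fps: int) -> int:
    """Number of output pixels that must be produced per second."""
    return width * height * fps


def macs_per_pixel(num_macs: int, clock_hz: float, rate: int) -> float:
    """MAC operations available per output pixel at 100% PE utilization."""
    return num_macs * clock_hz / rate


rate = pixel_rate(WIDTH, HEIGHT, FPS)

# ASSUMPTION: 10_000 MACs is the example size quoted in the summary;
# the 500 MHz clock is a hypothetical value, not taken from the paper.
budget = macs_per_pixel(num_macs=10_000, clock_hz=500e6, rate=rate)

print(f"output pixels per second: {rate:,}")
print(f"MACs available per output pixel: {budget:.0f}")
```

Under these assumptions the display must receive about two billion pixels per second, which leaves only a few thousand MAC operations per output pixel even with a 10K-MAC array running at full utilization. That is one way to see why the summary ties model depth, on-chip feature-map storage, and PE-array power together as competing constraints.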