CORENEXT
SCIENTIFIC PUBLICATIONS
Welcome to the Scientific Publications section of COREnext, a hub for the scientific literature stemming from our European project. As pioneers in research and innovation, we work on 5G, 6G, cybersecurity, and trustworthiness, producing a diverse array of publications. Here, you’ll find a collection of our scientific papers, conference papers, and book chapters. Explore our publications to gain insights into the latest advancements and discoveries shaping the digital landscape. Join us as we share knowledge and drive progress towards a safer and more connected future!
TeraPool-SDR: An 1.89TOPS 1024 RV-Cores 4MiB Shared-L1 Cluster for Next-Generation Open-Source Software-Defined Radios
In this paper, the authors address the increasing demands of 5G and future RAN workloads by presenting TeraPool-SDR, a highly efficient processing cluster designed for Software-Defined Radio (SDR). The cluster features 1024 processing elements and a fast memory system, achieving high energy efficiency across key 5G tasks while consuming less than 10 W of power. (DOI: https://doi.org/10.1145/3649476.365873)
An Energy-Efficient 56-Gb/s D-Band TX-to-RX Link Using CMOS ICs and Transmitarray Antennas
This letter presents an energy-efficient system featuring multichannel ICs in 45-nm CMOS technology and antennas-in-package, achieving data rates up to 56 Gb/s over a 1-meter link. The system, which uses a channel-aggregation architecture with a large RF bandwidth and a narrow baseband interface, consumes just 33 pJ/bit. (DOI: https://doi.org/10.1109/LMWT.2024.3395905)
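As a rough illustration of what these two figures imply together, the sketch below multiplies energy per bit by bit rate to estimate power consumption. It assumes the 33 pJ/bit figure applies at the peak rate of 56 Gb/s, which this summary does not state. The function name `link_power_watts` is hypothetical and used only for this example.

```python
# Back-of-the-envelope estimate: power = energy per bit x bit rate.
# Assumption: the 33 pJ/bit figure is quoted at the 56 Gb/s peak rate.
ENERGY_PER_BIT_J = 33e-12   # 33 pJ/bit, from the summary above
DATA_RATE_BPS = 56e9        # 56 Gb/s, from the summary above


def link_power_watts(energy_per_bit_j: float, data_rate_bps: float) -> float:
    """Return the power in watts implied by an energy-per-bit figure."""
    return energy_per_bit_j * data_rate_bps


power_w = link_power_watts(ENERGY_PER_BIT_J, DATA_RATE_BPS)
print(f"Implied power at peak rate: {power_w:.3f} W")
```

Under that assumption the estimate comes to a little under 2 W. The letter itself is the authoritative source for what the energy figure covers.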
LRSCwait: Enabling Scalable and Efficient Synchronization in Manycore Systems through Polling-Free and Retry-Free Operation
In this paper, the authors address the issue of polling in shared-memory manycore systems, which leads to contention and inefficiencies. They propose LRwait and SCwait synchronization methods, along with the scalable Colibri implementation, to reduce polling by allowing cores to sleep while waiting. This approach results in a 6.5x improvement in throughput and a 7.1x increase in energy efficiency on a 256-core RISC-V platform, with only a 6% area overhead. (DOI: https://doi.org/10.48550/arXiv.2401.09359)
Towards Disaggregation-Native Data Streaming between Devices, 3rd Workshop on Heterogeneous Composable and Disaggregated Systems (HCDS)
This paper explores the ongoing trend of disaggregation in datacenters, which aims to increase flexibility by connecting pools of CPUs, accelerators, and memory using interconnect technologies like CXL.
(DOI: https://doi.org/10.48550/arXiv.2406.09421)
Core-Local Reasoning and Predictable Cross-Core Communication with M3
This paper enhances the real-time capabilities of the M³ architecture while maintaining its robust security properties, addressing the balance between predictable performance and security at the system-architecture level. (DOI: https://doi.org/10.1109/RTAS61025.2024.00024)
Towards Modular Trusted Execution Environments
In this workshop paper, the authors propose a modular TEE design. They apply this modular design to the M³ hardware/software co-design platform and demonstrate how TEE support can be made a first-class feature at the system-architecture level.
(DOI: https://doi.org/10.1145/3578359.3593037)
Circularly Polarized Sub-THz Antenna Design for Distributed Deployment
In this paper, the authors propose an antenna-in-package concept for a single-layer substrate on low-cost embedded wafer-level ball grid array (eWLB) packages. (DOI: https://doi.org/10.23919/EuCAP60739.2024.10501515)
Distributed Radar Network with Polymer Microwave Fiber (PMF) Based Synchronization
This conference paper presents advances in distributed radar networks, which provide numerous advantages such as increased angular resolution and improved signal-to-noise ratio. (DOI: https://doi.org/10.1109/WiSNeT59910.2024.10438574)
MinPool: A 16-core NUMA-L1 Memory RISC-V Processor Cluster for Always-on Image Processing in 65nm CMOS
This paper presents MinPool, a low-power image processor for always-on functions implemented in TSMC’s 65 nm technology and based on a tailored MemPool architecture. (DOI: https://doi.org/10.1109/ICECS58634.2023.10382925)
An 80 Gbps QAM-16 PMF Link Using a 130 nm SiGe BiCMOS Process
In this work, a D-band (110–170 GHz) polymer microwave fiber (PMF) link for high-data-rate communication is presented.
(DOI: https://doi.org/10.1109/IMS37964.2023.10188207)
A Beyond 100-Gbps Polymer Microwave Fiber Communication Link at D-band
In this work, a D-band (110–170 GHz) ultra-high-data-rate link is presented and characterized.
(DOI: https://doi.org/10.1109/TCSI.2023.3262725)
Disruptive TRX design for D-band
In today’s connected world, the demand for mobile communications and instant access to information, anytime and anywhere, has drastically changed the electronics landscape, both consumer and industrial. The book containing this chapter provides an overview of the latest research results in RF and digital SOI technology development for 5G and 6G, device and substrate characterization, packaging technology, and the realization of full systems including power amplifiers, linearization techniques, beamforming transceivers, access points, and radar detection. (Electronic ISBN: 9788770040730)
Software-Defined CPU Modes
CPUs contain a compute instruction set that regular applications use. This paper explores the question of whether CPU modes could be defined entirely by software. The researchers show how such a design would function and explore the advantages it enables. They believe that pushing all existing modes under a common design umbrella would enforce a cleaner structure and more control over exposed functionality. At the same time, the flexibility of software-defined modes enables interesting new use cases.
(DOI: https://doi.org/10.1145/3593856.3595894)
Dual Vector Load for Improved Pipelining in Vector Processors
Vector processors execute instructions that manipulate vectors of data items using time-division multiplexing (TDM). In this paper, the researchers propose a dual vector load: a parallel or interleaved load of the two input vectors. Their investigation finds that compute-bound and some memory-bound applications benefit from this feature when the memory and compute bandwidths are sufficiently high. A speedup of up to 33% is possible in the ideal case.
(DOI: https://doi.org/10.1109/COOLCHIPS57690.2023.10121996)