
CORENEXT
SCIENTIFIC PUBLICATIONS
Welcome to the Scientific Publications section of COREnext, a hub for the scientific literature stemming from our European project. As pioneers in research and innovation, we delve into the realms of 5G, 6G, cyber security, and trustworthiness, producing a diverse array of publications. Here, you’ll find a comprehensive collection of our scientific papers, conference papers, and book chapters. Explore our publications to gain insights into the latest advancements and discoveries shaping the digital landscape. Join us as we share knowledge and drive progress towards a safer and more connected future!
Conflict Management in Vector Register Files
Vector processors’ instruction set architecture works with vectors in a vector register file, which must manage multiple concurrent accesses. High utilization leads to access conflicts, causing performance degradation. Using a software model, we explore the impact and characteristics of these conflicts and methods to manage them: avoidance, resolution, and mitigation. For avoidance, we examine static bank layouts and propose a dynamic one to address their limitations by assigning new registers a unique starting bank. For resolution, we compare arbitration algorithms and optimize round-robin for mixed-width arithmetic by prioritizing wide operands. For mitigation, we study operand queues of varying depths. Our solutions aim to improve vector processors’ area efficiency by allowing shallower operand queues or reducing the number of banks, with a performance impact of 10% or less. These insights can also apply to other shared memory systems.
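The avoidance idea above can be illustrated with a minimal sketch. It is not the paper's software model: the bank count, the striping rule, the static layout (start bank derived from the register number), and the names `static_start`, `DynamicLayout`, and `count_conflicts` are all assumptions made for illustration.

```python
NUM_BANKS = 8  # hypothetical bank count, not taken from the paper


def bank_of(start_bank: int, element: int, num_banks: int = NUM_BANKS) -> int:
    """Bank holding an element, assuming elements are striped across banks."""
    return (start_bank + element) % num_banks


def static_start(reg: int, num_banks: int = NUM_BANKS) -> int:
    """Assumed static layout: the start bank is fixed by the register number."""
    return reg % num_banks


class DynamicLayout:
    """Assumed dynamic layout: each newly allocated register receives the
    next start bank in rotation, so up to num_banks live registers have
    distinct start banks."""

    def __init__(self, num_banks: int = NUM_BANKS) -> None:
        self.num_banks = num_banks
        self._next = 0
        self.start: dict[int, int] = {}

    def allocate(self, reg: int) -> int:
        if reg not in self.start:
            self.start[reg] = self._next
            self._next = (self._next + 1) % self.num_banks
        return self.start[reg]


def count_conflicts(starts: list[int], length: int,
                    num_banks: int = NUM_BANKS) -> int:
    """Count cycles in which two or more operands hit the same bank,
    assuming every operand reads element i in cycle i."""
    conflicts = 0
    for i in range(length):
        banks = [bank_of(s, i, num_banks) for s in starts]
        if len(set(banks)) < len(banks):
            conflicts += 1
    return conflicts


# Three operands whose register numbers collide under the static layout.
regs = (0, 8, 16)
static = [static_start(r) for r in regs]
layout = DynamicLayout()
dynamic = [layout.allocate(r) for r in regs]
print(count_conflicts(static, 16))   # 16: every cycle conflicts
print(count_conflicts(dynamic, 16))  # 0: start banks 0, 1, 2 never collide
```

In this toy model the rotation wraps once more registers are allocated than there are banks, so start banks are unique only up to `NUM_BANKS` live registers.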
An Architecture for Shrinking the TCB of TEEs on Heterogeneous Systems
Trusted Execution Environments (TEEs) enable secure code execution on machines that are not fully controlled by the user who runs the code. However, existing TEE solutions do not provide unified support for systems with heterogeneous core architectures or accelerators. Furthermore, their implementation is complex and requires the user to trust (typically closed) firmware in addition to the TEE hardware. We propose a heterogeneous TEE architecture with minimal hardware support to reduce the trust in firmware, as well as a minimal Root-of-Trust that enables features such as remote attestation for such TEEs.
Towards adaptive RISC-V based systems for non-terrestrial sub-THz communication
The upcoming 6G communication standard promises unrivaled bandwidth, connectivity, and coverage and will likely span from the most remote places through densely populated areas into low Earth orbit. Implementing this vision, however, poses considerable challenges to the underlying processing hardware, and advanced solutions are needed to meet these requirements, especially in space. These challenges include the need for significant technological advances, critical demands in terms of performance, reliability, and adaptability, and considerations regarding the trustworthiness of devices, to name only a few. This paper presents our joint efforts to address these needs and enable open-source, adaptive, and fault-tolerant processing systems for 6G communication systems in low Earth orbit.
A novel sparse-connected architecture for multi-user mmWave communication
This paper proposes a sparse analog beamforming (ABF) architecture for multiuser mmWave downlink systems, where each RF chain is connected to a subset of antennas. Beam alignment and RF chain-antenna associations are jointly optimised using a binary association matrix (BAM), with an iterative algorithm addressing the complexity. Simulations show the approach outperforms fully and partially connected architectures, improving SINR and bit error rate while reducing the reliance on precise beam allocation.
Broadband Sub-THz Dielectric Waveguides Characterization
This paper presents the sub-THz characterisation of various plastic fibres for high-speed mmWave communications using cost-effective, broadband waveguide transitions. Transitions from rectangular to circular and circular to plastic waveguides were designed and tested at D-band (110–170 GHz) and H-band (220–330 GHz). Results show insertion losses as low as 1–2 dB and fibre losses between 4 and 20 dB/m, depending on geometry and frequency, with strong alignment between simulations and measurements.
A Transmitter/Receiver Link for High Data Rate Polymer Microwave Fiber Communication at Y-band
A Y-band (170-260 GHz) ultra-high data rate transmitter (Tx) and receiver (Rx) are designed and fabricated in a commercial 130 nm silicon germanium (SiGe) BiCMOS process. The link has demonstrated data rates up to 30 Gbps over a one-meter polymer microwave fiber (PMF), using a carrier of 237 GHz. This is the first PMF link above 200 GHz reaching a distance of one meter.
D-Band Channel Modelling by 3D Ray Tracing for Joint Communications and Sensing
This paper presents 3D geometrical channel modelling experiments in the D-band and shows the feasibility of developing joint communications and sensing (JCAS) applications in this spectrum. We propose a novel, flexible 3D ray tracer for deterministic channel modelling in the D-band and benchmark its output against existing measurements with quantified differences: the received power deviation is within 2 dB, the delay deviation is within 1 ns, and the angle deviation is within 4°. Statistics of the multipath components simulated by the ray tracer are also investigated under different ray tracing configurations, featuring a non-linear relationship versus the scatterer settings, and the output of the ray tracer can be exploited for sensing applications.
A 1024 RV-Cores Shared-L1 Cluster with High Bandwidth Memory Link for Low-Latency 6G-SDR
We introduce an open-source architecture for next-generation Radio-Access Network baseband processing: 1024 latency-tolerant 32-bit RISC-V cores share 4 MiB of L1 memory via an ultra-low latency interconnect (7-11 cycles), and a modular Direct Memory Access engine provides an efficient link to a high bandwidth memory, such as HBM2E (98% peak bandwidth at 910 GBps). The system achieves leading-edge energy efficiency at sub-ms latency in key 6G baseband processing kernels: Fast Fourier Transform (93 GOPS/W), Beamforming (125 GOPS/W), Channel Estimation (96 GOPS/W), and Linear System Inversion (61 GOPS/W), with only 9% data movement overhead.
Sensitivity Analysis of mmWave Multiuser MIMO with Imperfect Analog Beamforming State Information
This paper analytically examines the impact of imperfect beamforming (BF) information on analog-beamformed multiuser downlink MIMO systems. It derives approximations for average SINR and symbol error probability (SEP) under various BF error distributions and validates them through simulations for both partially and fully connected architectures. Results show that sub-degree BF accuracy, especially at the transmitter, is crucial, particularly as system load and antenna count increase. The study highlights the significant alignment challenges facing future mmWave systems using analog BF.
Twisting Effects on X-Shaped Millimeter-Wave Plastic Waveguides
This paper investigates the impact of twisting on hybrid and mode propagation in X-shaped plastic waveguides, which offer a lightweight, low-cost alternative for high-speed data transmission. Unlike prior studies focusing on bending, this work addresses twisting, both theoretically and experimentally. Results show that in both twisted and twisted-bent configurations, the X-shaped design maintains a stable polarisation direction, highlighting its robustness for applications like data centres and autonomous vehicles.
High-Performance Polymer Microwave Fiber Coupler in eWLB Package for Sub-THz Communication
In this paper, a compact and efficient transceiver integrated circuit (IC) to polymer microwave fiber (PMF) coupler realized in an embedded wafer level ball grid array (eWLB) package is presented for the first time. The proposed solution uses a Vivaldi antenna realized using the redistribution layer of eWLB. The system operates around 140 GHz and achieves a coupling loss of only 4 dB.
The Evolution of Mobile Network Operations: A Comprehensive Analysis of Open RAN Adoption
This paper examines the transformative potential of open RAN (O-RAN) technology for Mobile Network Operators (MNOs) aiming to modernise their infrastructure in response to increasing data demands. It provides a thorough overview of the current state of open RAN research, deployments, and technologies, followed by an analysis of the decision-making roadmap for adoption, covering network design, vendor selection, and implementation strategies. The paper also explores key components, functional splits, and accelerator options, offering practical guidance for MNOs. It concludes by discussing the modular nature of O-RAN and the complexity of its design phase, highlighting both challenges and proposed solutions.
TeraPool-SDR: An 1.89TOPS 1024 RV-Cores 4MiB Shared-L1 Cluster for Next-Generation Open-Source Software-Defined Radios
In this paper, the authors address the increasing demands of 5G and future RAN workloads by presenting TeraPool-SDR, a highly efficient processing cluster designed for Software Defined Radio (SDR). The cluster features 1024 processing elements and a fast memory system, achieving high energy efficiency across key 5G tasks while consuming less than 10 W of power. (DOI: https://doi.org/10.1145/3649476.365873)
An Energy-Efficient 56-Gb/s D-Band TX-to-RX Link Using CMOS ICs and Transmitarray Antennas
This letter presents an energy-efficient system featuring multichannel ICs in 45-nm CMOS technology and antennas-in-package, achieving data rates up to 56 Gb/s over a 1-meter link. The system, which uses a channel-aggregation architecture with a large RF bandwidth and a narrow baseband interface, consumes just 33 pJ/bit. (DOI: 10.1109/LMWT.2024.3395905)
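To put the energy-per-bit figure in context, link power is energy per bit multiplied by bit rate. The sketch below assumes that the 33 pJ/bit figure applies at the 56 Gb/s peak rate, which the summary does not state explicitly; the function name `link_power_watts` is hypothetical.

```python
def link_power_watts(energy_per_bit_pj: float, data_rate_gbps: float) -> float:
    """Power in watts from energy per bit (pJ/bit) and data rate (Gb/s).

    pJ/bit * Gb/s = 1e-12 J/bit * 1e9 bit/s = 1e-3 W, hence the 1e-3 factor.
    """
    return energy_per_bit_pj * data_rate_gbps * 1e-3


# Assumption: 33 pJ/bit holds at the 56 Gb/s peak data rate.
power = link_power_watts(33.0, 56.0)
print(f"{power:.3f} W")  # 1.848 W
```

Under that assumption, the figures would correspond to roughly 1.85 W for the link.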
LRSCwait: Enabling Scalable and Efficient Synchronization in Manycore Systems through Polling-Free and Retry-Free Operation
In this paper, the authors address the issue of polling in shared-memory manycore systems, which leads to contention and inefficiencies. They propose LRwait and SCwait synchronization methods, along with the scalable Colibri implementation, to reduce polling by allowing cores to sleep while waiting. This approach results in a 6.5x improvement in throughput and a 7.1x increase in energy efficiency on a 256-core RISC-V platform, with only a 6% area overhead. (DOI: https://doi.org/10.48550/arXiv.2401.09359)
Towards Disaggregation-Native Data Streaming between Devices, 3rd Workshop on Heterogeneous Composable and Disaggregated Systems (HCDS)
This paper explores the ongoing trend of disaggregation in datacenters, which aims to increase flexibility by connecting pools of CPUs, accelerators, and memory using interconnect technologies like CXL.
(DOI: https://doi.org/10.48550/arXiv.2406.09421)
Core-Local Reasoning and Predictable Cross-Core Communication with M3
This paper delves into enhancing the real-time capabilities of the M³ architecture while maintaining its robust security properties. This research addresses the critical balance between performance and security, offering innovative solutions for advanced system architecture. (DOI: https://doi.org/10.1109/RTAS61025.2024.00024)
Towards Modular Trusted Execution Environments
In this workshop paper, the authors propose a modular TEE design. They apply this modular design to the M3 hardware/software co-design platform and demonstrate how TEE support can be made a first-class feature at the system-architecture level.
(DOI: https://doi.org/10.1145/3578359.3593037)
Circularly Polarized Sub-THz Antenna Design for Distributed Deployment
In this paper, the authors propose an antenna-in-package concept for the single-layer substrate on low-cost embedded wafer level ball grid array (eWLB) packages. (DOI: https://doi.org/10.23919/EuCAP60739.2024.10501515)
Distributed Radar Network with Polymer Microwave Fiber (PMF) Based Synchronization
This conference paper presents advances in distributed radar networks, which provide numerous advantages such as increased angular resolution and improved signal-to-noise ratio. (DOI: https://doi.org/10.1109/WiSNeT59910.2024.10438574)
MinPool: A 16-core NUMA-L1 Memory RISC-V Processor Cluster for Always-on Image Processing in 65nm CMOS
This paper presents MinPool, a low-power image processor for always-on functions implemented in TSMC’s 65 nm technology and based on a tailored MemPool architecture. (DOI: https://doi.org/10.1109/ICECS58634.2023.10382925)
An 80 Gbps QAM-16 PMF Link Using a 130 nm SiGe BiCMOS Process
In this work, a D-band (110-170 GHz) polymer microwave fiber (PMF) link for high data rate communication is presented.
(DOI: https://doi.org/10.1109/IMS37964.2023.10188207)
A Beyond 100-Gbps Polymer Microwave Fiber Communication Link at D-band
In this work, a D-band (110-170 GHz) ultra high data rate link is presented and characterized.
(DOI: https://doi.org/10.1109/TCSI.2023.3262725)
Disruptive TRX design for D-band
In today’s connected world, the demand for mobile communications and instant access to information, anytime and anywhere, has drastically changed the electronics landscape, both consumer and industrial. This book provides an overview of the latest research results in RF and digital SOI technology development for 5G and 6G, device and substrate characterization, packaging technology, and the realization of full systems including power amplifiers, linearization techniques, beamforming transceivers, access points, and radar detection. (Electronic ISBN: 9788770040730)
Software-Defined CPU Modes
CPUs contain a compute instruction set, which regular applications use. This paper explores the question of whether CPU modes could be defined entirely by software. The authors show how such a design would function and explore the advantages it enables. They believe that pushing all existing modes under a common design umbrella would enforce a cleaner structure and more control over exposed functionality. At the same time, the flexibility of software-defined modes enables interesting new use cases.
(DOI: https://doi.org/10.1145/3593856.3595894)
Dual Vector Load for Improved Pipelining in Vector Processors
Vector processors execute instructions that manipulate vectors of data items using time-division multiplexing (TDM). In this paper, the researchers propose a dual vector load: a parallel or interleaved load of the two input vectors. Their investigation finds that compute-bound and some memory-bound applications profit from this feature when the memory and compute bandwidths are sufficiently high. A speedup of up to 33% is possible in the ideal case.
(DOI: https://doi.org/10.1109/COOLCHIPS57690.2023.10121996)