Tuesday, 20 March 2012

C program: a program to determine whether a number is even or odd


#include <stdio.h>

int main(void) {
    int n;
    printf("enter a value : ");
    /* stop if the input is not an integer */
    if (scanf("%d", &n) != 1) return 1;
    /* a number is even when dividing by 2 leaves no remainder */
    if (n % 2 == 0) {
        printf("even number!\n");
    } else {
        printf("odd number!\n");
    }
    return 0;
}

C program: a program to compute the area of a triangle and the area of a circle


#include <stdio.h>
#define pi 3.14

int main(void) {
    int j, a, t; /* j: radius, a: base, t: height */
    printf("enter radius : ");
    if (scanf("%d", &j) != 1) return 1;
    printf("area of circle : %f\n", pi * j * j);
    printf("enter base : ");
    if (scanf("%d", &a) != 1) return 1;
    printf("enter height : ");
    if (scanf("%d", &t) != 1) return 1;
    /* 0.5 is a double, so the product is computed in floating point */
    printf("area of triangle : %f\n", 0.5 * a * t);
    return 0;
}

vector material for grade 10 high school: vector multiplication and division


The negative of a vector $\vec{A}$ is written $-\vec{A}$ and is defined as a vector with the same magnitude as $\vec{A}$ but the opposite direction, so that $\vec{A} + (-1)\vec{A} = 0$. The concept of vector subtraction follows from this:
$$\vec{A} - \vec{B} = \vec{A} + (-1)\vec{B}.$$
Vector addition is commutative and associative, so $\vec{A} + \vec{B} = \vec{B} + \vec{A}$ and
$$\vec{A} + (\vec{B} + \vec{C}) = (\vec{A} + \vec{B}) + \vec{C}.$$

In three-dimensional space at most three vectors can be mutually perpendicular. Such mutually perpendicular vectors can serve as basis vectors. In the Cartesian coordinate system the basis vectors are usually taken to point along the positive x, y, and z axes and are denoted $\hat{x}$, $\hat{y}$, and $\hat{z}$. They are also chosen to have unit magnitude. Any vector $\vec{A}$ in three-dimensional space can therefore be written as a sum of the basis vectors with coefficients $A_x$, $A_y$, $A_z$, called the components of the vector along the x, y, and z basis directions:
$$\vec{A} = A_x\hat{x} + A_y\hat{y} + A_z\hat{z}$$
From trigonometry, if the angles between $\vec{A}$ and the x, y, and z axes are $\theta_x$, $\theta_y$, and $\theta_z$, then $A_x = A\cos\theta_x$, $A_y = A\cos\theta_y$, and $A_z = A\cos\theta_z$, where $A$ is the magnitude of $\vec{A}$. From the theorem

Multiplication
Two vectors can be 'multiplied'. The concept of a product of vectors is very useful in formulating many equations in physics. Multiplying vectors is quite different from simply multiplying two numbers (scalars) and has its own definitions. Two vectors can be multiplied to give either a scalar or a new vector. The product that gives a scalar is called the scalar product or dot product and is defined as
$$\vec{A} \cdot \vec{B} = AB\cos\theta$$
where $\theta$ is the angle between $\vec{A}$ and $\vec{B}$.

The magnitude of the sum $\vec{C} = \vec{A} + \vec{B}$ can be expressed by the following formula:
$$C = \sqrt{(\vec{A} + \vec{B}) \cdot (\vec{A} + \vec{B})} = \sqrt{A^2 + B^2 + 2AB\cos\theta}$$
If $\vec{A}$ and $\vec{B}$ are written in terms of their components, $\vec{A} = A_x\hat{x} + A_y\hat{y} + A_z\hat{z}$ and $\vec{B} = B_x\hat{x} + B_y\hat{y} + B_z\hat{z}$, then
$$\vec{A} \cdot \vec{B} = A_xB_x + A_yB_y + A_zB_z$$
because $\hat{x}\cdot\hat{y} = \hat{x}\cdot\hat{z} = \hat{y}\cdot\hat{z} = \cos 90^\circ = 0$ (mutually perpendicular) and $\hat{x}\cdot\hat{x} = \hat{y}\cdot\hat{y} = \hat{z}\cdot\hat{z} = \cos 0^\circ = 1$. Taking the dot product of any vector $\vec{A}$ with a basis vector gives the projection of $\vec{A}$ along that basis vector, for example $\vec{A}\cdot\hat{x} = A_x$.
The product of two vectors that gives a vector is called the cross product; for two vectors $\vec{A}$ and $\vec{B}$ it is written
$$\vec{A} \times \vec{B} = \vec{C}$$
Here $\vec{C}$ is a vector perpendicular to the plane containing $\vec{A}$ and $\vec{B}$, with its direction given by the right-hand rule for a rotation from $\vec{A}$ to $\vec{B}$. The magnitude of $\vec{C}$ is defined as
$$C = |\vec{A} \times \vec{B}| = AB\sin\theta$$
This magnitude can be interpreted as the area of the parallelogram whose two sides are $\vec{A}$ and $\vec{B}$. It follows from the definition that $\vec{A} \times \vec{B} = -\vec{B} \times \vec{A}$. For the basis vectors, $\hat{x}\times\hat{y} = \hat{z}$, $\hat{y}\times\hat{z} = \hat{x}$, $\hat{z}\times\hat{x} = \hat{y}$, and $\hat{x}\times\hat{x} = \hat{y}\times\hat{y} = \hat{z}\times\hat{z} = 0$.



basic physics material for grade 10 high school: Quantities and Measurement


Physics is the science that studies objects and the phenomena and states associated with them. To describe a phenomenon that occurs in, or is experienced by, an object, various physical quantities are defined, for example length, distance, mass, time, force, velocity, temperature, and luminous intensity. The names of some physical quantities coincide with everyday words, but the physical quantities do not always mean the same thing as their everyday counterparts. The Indonesian terms gaya (force), usaha (work), and momentum, for example, carry different meanings in everyday or literary language, as in “Anak itu bergaya di depan kaca” (the child is posing in front of the mirror), “Ia berusaha keras menyelesaikan soal ujiannya” (he is trying hard to finish his exam questions), and “Momentum perubahan politik sangat tergantung pada kondisi ekonomi negara” (the momentum of political change depends heavily on the country's economic condition).
Physical quantities are defined precisely, as physics terms with a specific meaning. Some can only be understood through the language of mathematics, while others can be explained in simple language, but all of them are tied to measurement (direct or indirect). Every physical quantity must be measurable, that is, quantifiable as a number. Something that cannot be expressed as a number is not a physical quantity and cannot be measured.
To measure is to compare two things, one of which is usually a standard that serves as the measuring instrument. When we measure the distance between two points, we compare it with a standard length, such as the length of a meter stick. When we measure the weight of an object, we compare it with the weight of a standard object. Measuring therefore requires a standard against which the size of the thing being measured is compared. The standard is usually assigned the value one and taken as the reference for a particular unit. Although we are free to choose any measurement standard we like, a standard is useless unless it is the same everywhere in the world, so an international standard is needed. The standard must also be practical and easy to reproduce anywhere in the world. Such an international system of standards exists and is now known as the International System (SI).

Physical quantities may be related to one another. These relations can be expressed as physics equations, in which the quantities are represented by symbols to keep the equations compact. Because physical quantities may be interrelated, there must be a set of quantities that underlies all the others; that is, every physical quantity can be expressed in terms of a certain number of quantities, called base quantities. There are seven base quantities in physics (each listed with its unit):
1. length (meter)
2. mass (kilogram)
3. time (second)
4. electric current (ampere)
5. temperature (kelvin)
6. amount of substance (mole)
7. luminous intensity (candela)
The SI unit of length is the meter. One meter was formerly defined (from 1960 to 1983) as 1 650 763.73 wavelengths of the light emitted in the 2p10 - 5d5 transition of the isotope Kr-86; since 1983 it has been defined as the distance light travels in vacuum in 1/299 792 458 of a second. The SI unit of time is the second, and one second is defined as 9 192 631 770 periods of a particular transition of the Cs-133 atom. The SI unit of mass is the kilogram. Until 2019 one kilogram was defined as the mass of a platinum-iridium cylinder kept at the International Bureau of Weights and Measures in France; it is now defined by fixing the value of the Planck constant. There is also a non-SI mass standard, the atomic mass standard, based on the mass of one C-12 atom, which is defined to be exactly 12 unified atomic mass units (amu, abbreviated u).
Physical quantities can broadly be grouped into three kinds: scalars, vectors, and tensors. Tensors are not covered in basic physics. A scalar quantity has magnitude only, whereas a vector quantity has both magnitude and direction. Because vectors are used widely in physics, they are described briefly below.
1.2 Vectors
A simple example of a vector is the position vector. To specify the position of a point relative to another point, we need a coordinate system. In three-dimensional space, a coordinate system x, y, z is needed to describe the position of a point relative to an origin (O). The position vector of a point P relative to the origin is illustrated below.

Vector Addition
The concept of vector addition also follows from position vectors. Let the position vector of point A be $\vec{A}$, and let the position of point B as seen from point A be $\vec{B}$. The position vector of point B is then $\vec{C}$, which can be written as the sum of $\vec{A}$ and $\vec{B}$: $\vec{A} + \vec{B} = \vec{C}$.




what EWB (Electronic WorkBench) is, with a tutorial for beginners

EWB (Electronic WorkBench) is electronics software used to simulate the operation of an electric circuit. Simulation makes it possible to test whether a circuit works properly and agrees with the theory given in electronics textbooks without having to build the circuit physically. Keep in mind that an EWB simulation produces ideal output, that is, output unaffected by non-ideal factors such as the disturbances (known as noise in electronics) that commonly occur in real circuits. Using EWB requires basic knowledge of electronics.
Without adequate basic knowledge, such as how to use measuring instruments (oscilloscope, multimeter, and so on), it will be harder to understand how the software works. The software uses a GUI (Graphical User Interface) like Windows, so users who already understand basic electronics will find it easy to learn. Most copies of EWB circulating in Indonesia are pirated (cracked); avoid using pirated software for large projects where software licensing matters.
How to install EWB 5.12:
Installing the software is fairly easy. Locate the EWB 5.12 setup file and double-click it. Choose the installation folder (for example C:\Program Files\EWB 5.12), then click OK. Wait for the installation to finish, then open Start menu > Programs > Electronic Workbench > EWB 5.12. EWB is ready to use.



Using EWB in brief:
The author can explain this software only to a limited extent, so this module covers its use only briefly. In general, a new EWB user needs to master three things: using the measuring instruments provided, using electronic components (active components, passive components, and signal or voltage sources), and building a circuit.
Using the measuring instruments
After starting EWB you will see three toolbar rows (the menu row with File, Edit; the icon row with New, Open; and the component and instrument row). In the last row, click the rightmost toolbar button. Then choose the instrument you want (oscilloscope or multimeter) and drag its symbol down onto the workspace (the white area). The oscilloscope symbol has four small terminals: channel A, channel B, and two ground nodes. To change time/div and volt/div, as on a real oscilloscope, double-click the oscilloscope symbol. A small window appears in which you can enter the time/div and volt/div values you want or change other settings. The multimeter is used in much the same way: drag the multimeter symbol and double-click it to change the measurement mode (current, voltage, or resistance).
Using electronic components
In the last row, the second through thirteenth icon buttons hold the component symbols. For this basic electronics lab you only need the second through fifth buttons, which contain symbols such as the resistor, capacitor, diode, op-amp, battery, and ground. Components are used in much the same way as the instruments. To change a component's value, double-click the component and enter the value in the field provided.
For more detail on using instruments and components, ask the lab assistant during the lab session.
(The signal generator symbol is on the rightmost toolbar button, the instrument toolbar.)
Building a circuit
After placing the components you need, connect the terminals of one symbol to those of another. To make a connection, move the mouse pointer to the end of a terminal until it is highlighted, press and hold the mouse button, drag to the terminal you want to connect until that terminal is highlighted too, then release. The two components are now joined by a wire. For further explanation, ask the assistant.

Simulation
Once you have mastered these three things, you can build a circuit. After the circuit is built and the instruments are attached at the points to be measured (usually the input and output), start the simulation by clicking the switch symbol at the top right (click I to turn the simulation on and O to turn it off; the pause button is also useful, especially for noting down values). Keep the instrument's small window open so the measured waveform can be read.
Having learned the three basic steps and how to run a simulation, you should have a grasp of the basics of this software. To learn it in more detail, ask the Assistant about anything you do not yet understand. We hope you enjoy simulating circuits with Electronic WorkBench (EWB) 5.12.

Sunday, 18 March 2012

Probability and Basic Statistics material: probability theory for university

Types of Statistics
• Descriptive statistics
Presents data as statistical measures that are easy to interpret, such as the minimum, mean, standard deviation, median, and maximum, or presents the data as diagrams.
• Inferential statistics
Uses descriptive statistics to estimate and test statistical quantities.
• Data
Information recorded and collected in its original form, as counts or as measurements.
• Statistical experiment
An experiment is a repeatable process whose outcome cannot be predicted with certainty beforehand. Experiments are used to generate raw data.
Basics of Probability Theory
• Sample space
• Events and their operations
• Counting sample points:
– Permutations
– Combinations

Thursday, 15 March 2012

Probability theory and IC Databook

For those who do not yet have the probability theory material, it can be downloaded at

download

and the IC Databook

download

DMA to PLB 4 Controller


DMA to PLB4 Controller
This DMA controller provides a DMA interface dedicated to the USB 2.0 device ports and the 128-bit PLB.
Features include:
• 4 independent channels supporting internal USB 2.0 Device endpoints 1 and 2
• Support for memory-to-memory, peripheral-to-memory, and memory-to-peripheral transfers
• Scatter/gather capability
• 128-byte buffer with programmable thresholds
Serial Ports (UART)
Features include:

• Up to four ports in the following combinations:
– One 8-pin
– Two 4-pin
– One 4-pin and two 2-pin
– Four 2-pin
• Selectable internal or external serial clock to allow wide range of baud rates
• Register compatibility with NS16550 register set
• Complete status reporting capability
• Fully programmable serial-interface characteristics
• Supports DMA using internal DMA function on PLB 64
IIC Bus Interface
Features include:
• Two IIC interfaces provided
• Support for Philips® Semiconductors I2C Specification, dated 1995
• Operation at 100kHz or 400kHz
• 8-bit data
• 10- or 7-bit address
• Slave transmitter and receiver
• Master transmitter and receiver
• Multiple bus masters
• Two independent 4 x 1 byte data buffers
• Twelve memory-mapped, fully programmable configuration registers
• One programmable interrupt request signal
• Provides full management of all IIC bus protocols
• Programmable error recovery
• Includes an integrated boot-strap controller (BSC) that is multiplexed with the IIC0 interface

External Peripheral Bus Controller (EBC) The PowerPC 440 EP


External Peripheral Bus Controller (EBC)
Features include:
• Up to six ROM, EPROM, SRAM, Flash memory, and slave peripheral I/O banks supported
• Up to 66.66MHz operation
• Burst and non-burst devices
• 16-bit byte-addressable data bus
• 30-bit address
• Peripheral Device pacing with external “Ready”
• Latch data on Ready, synchronous or asynchronous
• Programmable access timing per device
– 256 Wait States for non-burst

– 32 Burst Wait States for first access and up to 8 Wait States for subsequent accesses
– Programmable CSon, CSoff relative to address
– Programmable OEon, WEon, WEoff (1 to 4 clock cycles) relative to CS
• Programmable address mapping
• External DMA Slave Support
• External master interface
– Write posting from external master
– Read prefetching on PLB for external master reads
– Bursting capable from external master
– Allows external master access to all non-EBC PLB slaves
– External master can control EBC slaves for own access and control
Ethernet Controller Interface
Ethernet support provided by the PPC440EP interfaces to the physical layer but the PHY is not included on the
chip:
• One to two 10/100 interfaces running in full- and half-duplex modes
– One full Media Independent Interface (MII) with 4-bit parallel data transfer
– Two Reduced Media Independent Interfaces (RMII) with 2-bit parallel data transfer
– Two Serial Media Independent Interfaces (SMII)
– Packet reject support
DMA to PLB3 Controller
This DMA controller provides a DMA interface between the OPB and the 64-bit PLB.
Features include:
• Supports the following transfers:
– Memory-to-memory transfers
– Buffered peripheral to memory transfers
– Buffered memory to peripheral transfers
• Four channels
• Scatter/Gather capability for programming multiple DMA operations
• 32-byte buffer
• 8-, 16-, 32-bit peripheral support (OPB and external)
• 32-bit addressing
• Address increment or decrement
• Supports internal and external peripherals
• Support for memory mapped peripherals
• Support for peripherals running on slower frequency buses

PCI Interface of The PowerPC 440 EP


PCI Interface
The PCI interface allows connection of PCI devices to the PowerPC processor and local memory. This interface is designed to Version 2.2 of the PCI Specification and supports 32-bit PCI devices.
Reference Specifications:
• PowerPC CoreConnect Bus (PLB) Specification Version 3.1
• PCI Specification Version 2.2
• PCI Bus Power Management Interface Specification Version 1.1
Features include:
• PCI 2.2
– Frequency to 66MHz
– 32-bit bus
• PCI Host Bus Bridge or an Adapter Device's PCI interface
• Internal PCI arbitration function, supporting up to six external devices, that can be disabled for use with an
external arbiter
• Support for Message Signaled Interrupts
• Simple message passing capability
• Asynchronous to the PLB
• PCI Power Management 1.1
• PCI register set addressable both from on-chip processor and PCI device sides
• Ability to boot from PCI bus memory
• Error tracking/status
• Supports initiation of transfer to the following address spaces:
– Single beat I/O reads and writes
– Single beat and burst memory reads and writes
– Single beat configuration reads and writes (type 0 and type 1)
– Single beat special cycles
DDR SDRAM Memory Controller
The Double Data Rate (DDR) SDRAM memory controller supports industry standard discrete devices. Up to four
256MB logical banks are supported in limited configurations. Global memory timings, address and bank sizes, and
memory addressing modes are programmable.
Features include:
• Registered and non-registered industry standard discrete devices
• 32-bit memory interface with optional 8-bit ECC (SEC/DED)
• Sustainable 1.1GB/s peak bandwidth at 133MHz
• SSTL_2 logic
• 1 to 4 chip selects
• CAS latencies of 2, 2.5 and 3 supported
• DDR200/266 support
• Page mode accesses (up to eight open pages) with configurable paging policy
• Programmable address mapping and timing
• Hardware and software initiated self-refresh
• Power management (self-refresh, suspend, sleep)

Internal Buses The PowerPC 440EP


Internal Buses
The PowerPC 440EP features five standard on-chip buses: two Processor Local Buses (PLBs), two On-Chip Peripheral Buses (OPBs), and the Device Control Register Bus (DCR). The high performance, high bandwidth cores such as the PowerPC 440 processor core, the DDR SDRAM memory controller, and the PCI bridge connect to the PLBs. The primary OPB hosts lower data rate peripherals. The secondary OPB is dedicated to USB 2.0 and DMA. The daisy-chained DCR provides a lower bandwidth path for passing status and control information between the processor core and the other on-chip cores.
Features include:
• PLB4
– 128-bit implementation of the PLB architecture
– Separate and simultaneous read and write data paths
– 36-bit address
– Simultaneous control, address, and data phases
– Four levels of pipelining
– Byte-enable capability supporting unaligned transfers
– 32- and 64-byte burst transfers
– 133MHz, maximum 4.25GB/s (simultaneous read and write)
– Processor:bus clock ratios of N:1 and N:2
• PLB3
– 64-bit implementation of the PLB architecture
– 32-bit address
– 133MHz (1:1 ratio with PLB 128), maximum 1.1GB/s (no simultaneous read and write)
• OPB (2)
– 32-bit data path
– 32-bit address
– 66.66MHz
• DCR
– 32-bit data path
– 10-bit address

microprocessor: 440EP PowerPC 440EP Embedded Processor


• PowerPC® 440 processor core operating up to
667MHz with 32KB I-cache and D-cache with
parity checking.
• Selectable processor:bus clock ratios of N:1, N:2.
• Floating Point Unit with single- and double-precision
and single-cycle throughput.
• Dual bridged Processor Local Buses (PLBs) with
64- and 128-bit widths.
• Double Data Rate (DDR) Synchronous DRAM
(SDRAM) interface operating up to 133MHz with
ECC.
• DMA support for external peripherals, internal
UART and memory.
• PCI V2.2 interface (3.3V only). Thirty-two bits at
up to 66MHz.
• Programmable interrupt controller supports
interrupts from a variety of sources.
• Programmable General Purpose Timers (GPT).
• Two Ethernet 10/100Mbps half- or full-duplex
interfaces. Operational modes supported are MII,
RMII, and SMII with packet reject.

• Up to four serial ports (16550 compatible UART).
• Two USB ports. One USB 1.1 Host interface with
on-chip PHY. One USB 2.0 Device interface, with
dedicated DMA, configured as a 1.1 on-chip PHY
or a 2.0 UTMI.
• External peripheral bus (16-bit data) for up to six
devices with external mastering.
• Two IIC interfaces (one with boot parameter read
capability).
• NAND Flash interface.
• SPI interface.
• General Purpose I/O (GPIO) interface.
• JTAG interface for board level testing.
• Boot from PCI memory, NOR Flash on the
external peripheral bus, or NAND Flash on the
NAND Flash interface.
• Available in RoHS compliant lead-free package.

MPC7448 POWERPC PROCESSOR HIGHLIGHTS microprocessor



[Figure: MPC7448 PowerPC® processor block diagram]

The MPC7448 is the first discrete high-performance PowerPC® processor manufactured on 90 nanometer silicon-on-insulator (SOI) process technology and continues Freescale Semiconductor’s strong legacy of providing PowerPC products with significant processing performance at very low power. The MPC7448 is designed to exceed 1.5 GHz processing performance and offers enhanced power management capabilities. Running at 1.4 GHz, the MPC7448 is expected to use less than 10 watts of power. MPC7448 processors are ideal for leading-edge computing, embedded network control and signal processing applications. Key architectural features include an MPX bus that scales to 200 MHz, 1 MB of on-chip L2 cache with support for Error Correcting Codes (ECC), and full 128-bit implementation of Freescale’s AltiVec™ technology with the added feature of supporting out-of-order transactions. The MPC7448 is pin compatible with Freescale’s MPC7447 and MPC7447A PowerPC products, offering an easy upgrade path to better system performance.

Caching In
L2 cache helps keep the PowerPC processor pipeline full, enabling faster and more efficient processing, and the increase in the MPC7448’s L2 cache to 1 MB provides even greater opportunity for performance gains. The L2 cache is fully pipelined for two-cycle throughput in the MPC7448. It responds with an 11-cycle load latency for an L1 miss that hits in L2 with ECC disabled and 12 cycles when ECC is enabled. In the MPC7448, as many as six outstanding cache misses are allowed between the L1 data cache and L2 bus. In addition, the MPC7448 supports a second cacheable store miss. The processors also provide cache locking to the L1 caches so that key performance algorithms and code can be locked in the L1 cache.


MPC7448 POWERPC PROCESSOR HIGHLIGHTS

CPU Speeds (internal): At least 1.5 GHz
Instructions per Clock: 4 (3 + Branch)
L1 Cache (integrated): 32 KB instruction, 32 KB data
L2 Cache (integrated): 1 MB with optional ECC
Execution Units: Integer(4), Floating-Point, AltiVec(4), Branch, Load/Store
Bus Protocol: MPX/60x
Bus Frequency: 200 MHz
Bus Interface: 64-bit
Package: 360 HiCTE BGA
Process Technology: 90 nm silicon-on-insulator (SOI), Multi-Vt, Triple Gate Oxide, Low-K Dielectric, 10 Year Reliability at 105°C



Compatibility and Support
The MPC7448 can be a drop-in upgrade for MPC7447 and MPC7447A processors because it is pin-for-pin compatible. In addition, as with all PowerPC processors, the MPC7448 is fully software compatible with the MPC7xxx family of processors. The Freescale family of PowerPC processors continues to enjoy the support of a broad set of operating systems, compilers and development tools from third-party vendors.


architecture, buses, and functions of microprocessors and microcontrollers


Every computer we use contains a microprocessor. The microprocessor is also known as the Central Processing Unit (CPU). The CPU is the center of computation and data processing and is built on a small wafer called a "chip". A chip is also often called an "Integrated Circuit (IC)"; it is small, made of silicon, and can contain 10 million transistors.
The first microprocessor was the Intel 4004, introduced in 1971, but its uses were still very limited: it could only be used for addition and subtraction. The first microprocessor used in home computers was the Intel 8080, an 8-bit computer on a single chip, introduced in 1974. In 1979 a new microprocessor, the 8088, was introduced. The 8088 developed into the 80286, then the 80486, and then the Pentium, from the Pentium I to the present Pentium IV.


Transistor = a very small, tube-like element located on the chip.
Micron = the width, in microns (10^-6 m), of the smallest wire on the chip.
Clock Speed = the maximum clock rate of a processor.
Data width = the width of the Arithmetic Logic Unit (ALU), the unit that handles arithmetic such as subtraction, division, and multiplication.
MIPS = Millions of Instructions Per Second.

microprocessor pin functions
• AD15-AD0: multiplexed address bus (ALE = 1) / data bus (ALE = 0).
• A19/S6-A16/S3 (multiplexed): the upper 4 bits, A16 through A19, of the 20-bit address, or the status bits S6-S3.
• M/IO: indicates whether the address is a memory address or an I/O address.
• RD: when 0, the data bus receives data read from memory or from an I/O device.
• WR: indicates that the microprocessor is writing to memory or to an I/O device through the data bus. When 0, the data bus holds valid data.
• ALE (Address Latch Enable): when 1, the address/data bus carries a memory or I/O address.
• DT/R (Data Transmit/Receive): indicates whether the data bus is transmitting or receiving data.
• DEN (Data Bus Enable): enables the external data bus buffers.
• S7: logic 1, S6: logic 0.
• S5: indicates the state of the IF flag bit.
• S4-S3: show which segment is being accessed during the current bus cycle.
• S2, S1, S0: indicate the function of the current bus cycle (decoded by the 8288).
• INTR (Interrupt Request): when INTR = 1 and IF = 1, the microprocessor prepares to service the interrupt. INTA becomes active after the current instruction completes.
• INTA (Interrupt Acknowledge): the microprocessor's response to INTR. It is used to gate the interrupt vector number onto the data bus.
• NMI (Non-Maskable Interrupt): works like INTR, except that the IF flag bit is not checked; it uses interrupt vector 2.
• CLK (Clock): the clock input has a duty cycle of 33% (high for 1/3 and low for 2/3 of the period).
• VCC/GND: power supply (5V) and ground (0V).
• MN/MX: selects minimum mode (5V) or maximum mode (0V) operation.
• BHE (Bus High Enable): enables the most significant half of the data bus (D15-D8) during read and write operations.
• READY: inserts wait states, controlled by memory and I/O during reads or writes, into the microprocessor's timing.







 

Wednesday, 14 March 2012

MPC7448


[Figure: MPC7448 PowerPC® processor block diagram]

The MPC7448 is the first discrete high-performance
PowerPC® processor manufactured
on 90 nanometer silicon-on-insulator (SOI)
process technology and continues Freescale
Semiconductor’s strong legacy of providing
PowerPC products with significant processing
performance at very low power. The MPC7448
is designed to exceed 1.5 GHz processing
performance and offers enhanced power
management capabilities. Running at 1.4 GHz,
the MPC7448 is expected to use less than
10 watts of power. MPC7448 processors are
ideal for leading-edge computing, embedded
network control and signal processing
applications.

Key architectural features include an MPX bus that
scales to 200 MHz, 1 MB of on-chip L2 cache
with support for Error Correcting Codes (ECC),
and full 128-bit implementation of Freescale’s
AltiVec™ technology with the added feature of
supporting out-of-order transactions. The MPC7448
is pin compatible with Freescale’s MPC7447 and
MPC7447A PowerPC products, offering an easy
upgrade path to better system performance.
Caching In
L2 cache helps keep the PowerPC
processor pipeline full, enabling faster and
more efficient processing—and the increase
in the MPC7448’s L2 cache to 1 MB
provides even greater opportunity for
performance gains. The L2 cache is fully
pipelined for two-cycle throughput in the
MPC7448. It responds with an 11-cycle
load latency for an L1 miss that hits in L2
with ECC disabled and 12 cycles when
ECC is enabled. In the MPC7448, as many
as six outstanding cache misses are
allowed between the L1 data cache and L2
bus. In addition, the MPC7448 supports a
second cacheable store miss. The
processors also provide cache locking to
the L1 caches so that key performance
algorithms and code can be locked in the
L1 cache.
Compatibility and Support
The MPC7448 can be a drop-in
upgrade for MPC7447 and MPC7447A
processors because it is pin-for-pin compatible.
In addition, as with all PowerPC processors, the
MPC7448 is fully software compatible with the
MPC7xxx family of processors. The Freescale
family of PowerPC processors continues to
enjoy the support of a broad set of operating
systems, compilers and development tools
from third-party vendors.
Power Management
Continuing to pursue lower and lower power
consumption is a keen focus with the
Freescale family of PowerPC processors, and
the MPC7448 is no exception. Power
management features include:
> Expanded Dynamic Frequency Switching
(DFS) capability enabling improved power
savings (divide-by-two and divide-by-four
modes are provided)
> Voltage scales down to 0.9V
> Added benefits of 90 nanometer
technology include:
• Multi-Vt and triple gate oxide integrated
transistors for low standby power
• Low-K dielectric for high performance
with reduced power and noise
> Temperature sensing diodes included to
monitor die temperature
Superscalar Core
The MPC7448 processor features a
high-frequency superscalar e600 PowerPC
core*, capable of issuing four instructions per
clock cycle (three instructions plus one
branch) into
11 independent execution units:
> Four integer units (three simple plus one
complex)
> Double-precision floating point unit
> Four AltiVec technology units (simple,
complex, floating and permute)
> Load/store unit
> Branch processing unit
AltiVec Acceleration
The MPC7448 includes the same powerful
128-bit AltiVec vector execution unit as found
in previous MPC7xxx devices. AltiVec
technology may dramatically enhance the
performance of applications such as
voice-over-Internet Protocol (VoIP), speech
recognition, multi-channel modems, virtual
private network servers, high-resolution 3-D
graphics, motion video (MPEG-2, MPEG-4),
high fidelity audio (3-D audio, AC-3), and so
on. AltiVec computational instructions are
executed in the four independent, pipelined
AltiVec execution units. A maximum of two
AltiVec instructions can be issued in order to
any combination of AltiVec execution units per
clock cycle. In the MPC7448, a maximum of
two AltiVec instructions can be issued out-of-order
to any combination of AltiVec execution
units per clock cycle from the bottom two
AltiVec instruction queue entries. For example,
an instruction in queue one destined for
AltiVec integer unit one does not have to wait
for an instruction in queue zero that is stalled
behind an instruction waiting for operand
availability.
Freescale™ and the Freescale logo are trademarks of Freescale Semiconductor, Inc. The PowerPC name is a trademark of IBM Corp. and used under license.
All other product or service names are the property of their respective owners.
© Freescale Semiconductor, Inc. 2005
Document Number: MPC7448POWPCFS
REV 1
Learn More: For more information about Freescale products, please visit www.freescale.com.
MPC7448 POWERPC PROCESSOR HIGHLIGHTS
CPU Speeds (internal)     At least 1.5 GHz
Instructions per Clock    4 (3 + Branch)
L1 Cache (integrated)     32 KB instruction, 32 KB data
L2 Cache (integrated)     1 MB with optional ECC
Execution Units           Integer(4), Floating-Point, AltiVec(4), Branch, Load/Store
Bus Protocol              MPX/60x
Bus Frequency             200 MHz
Bus Interface             64-bit
Package                   360 HiCTE BGA
Process Technology        90 nm silicon-on-insulator (SOI), Multi-Vt, Triple Gate Oxide,
                          Low-K Dielectric, 10 Year Reliability at 105°C
*Note: The e600 PowerPC core is identical to the G4 core in previous 7xxx products.

[Figure: MPC7448 PowerPC® processor block diagram. Sequencer unit (instruction fetch, branch unit with BHT/BTIC, dispatch unit, completion unit) with 32 KB instruction cache; AltiVec, GPR and FPR issue queues; integer units (CFX, SFX0, SFX1, SFX2), LSU and FPU with GPR, FPR and VR rename buffers; AltiVec engine (float, complex, simple, permute); 32 KB data cache; interface to memory sub-system with 1 MB unified L2 cache, system interface unit and 60x/MPX bus interface.]

ARM Microcontroller Code Size Analysis


ARM Microcontroller Code Size Analysis | Overview 1
32‐Bit Microcontroller Code Size
Analysis
Draft 1.2.4. Joseph Yiu, Andrew Frame
Overview
Microcontroller application program code size can directly affect the cost and power consumption of
products; it is therefore almost always viewed as an important factor in the selection of a
microcontroller for embedded projects. Since the release and availability of 32‐bit processors such
as the ARM Cortex‐M3, more and more microcontroller users have discovered the benefits of
switching to 32‐bit products – lower power, greater energy efficiency, smaller code size and much
better performance. Whilst most of the benefits of using 32‐bit microcontrollers are widely known,
the code size advantage of 32‐bit microcontrollers is less obvious.
In this article we will explain why 32‐bit microcontrollers can reduce application code size whilst still
achieving high system performance and ease of use.
Typical myths of program size
Myth #1: 8‐bit and 16‐bit microcontrollers have smaller code size
There is a common misconception that switching from an 8‐bit microcontroller to a 32‐bit
microcontroller will result in much bigger code size – why? Many people have the impression that 8‐
bit microcontrollers use 8‐bit instructions and 32‐bit microcontrollers use 32‐bit instructions. This
impression is often reinforced by slightly misleading marketing from the 8‐bit and 16‐bit
microcontroller vendors.
In reality, many instructions in 8‐bit microcontrollers are 16‐bit, 24‐bit or other sizes larger than
8‐bit. For example, the PIC18 instruction sizes are 16‐bit and, with the 8051 architecture, although
some instructions are 1 byte long, many others are 2 or 3 bytes long.
So would code size be better moving to a 16‐bit microcontroller? Not necessarily. Taking the
MSP430 as an example, a single operand instruction can take 4 bytes (32 bits) and a double operand
instruction can take 6 bytes (48 bits). In the worst case, an extended immediate/index instruction in
MSP430X can take 8 bytes (64 bits).
So how about the code size for ARM Cortex microcontrollers? The ARM Cortex‐M3 and Cortex‐M0
processors are based on Thumb®‐2 technology, which provides excellent code density. Thumb‐2
microcontrollers have 16‐bit instructions as well as 32‐bit instructions, with the 32‐bit instruction
functionality a superset of the 16‐bit version. In most cases a C compiler will use the 16‐bit version
of the instruction. The 32‐bit version would only be used when the operation cannot be performed
with a 16‐bit instruction. As a result, most of the instructions in an ARM Cortex microcontroller
program are 16 bits long. That is even smaller than some of the instructions in 8‐bit microcontrollers.
[Figure: bar chart of minimum and maximum instruction size in bits (axis marked at 16, 32, 48 and 64) for the 8051, PIC18, PIC24, MSP430/MSP430X and ARM.]
Figure 1: Size of a single instruction in various processors
Within a compiled program for Cortex‐M processors, the number of 32‐bit instructions can be only a
small portion of the total instruction count. For example, the number of 32‐bit instructions in the
Dhrystone program image is only 15.8% of the total instruction count (average instruction size is
18.53 bits) when compiled for the Cortex‐M3. For the Cortex‐M0 the ratio of 32‐bit instructions is
even lower at 5.4% (average instruction size 16.9 bits).
Myth #2: My application only processes 8‐bit data and 16‐bit data
Many embedded developers think that if their application only processes 8‐bit data then there is no
benefit in switching to a 32‐bit microcontroller. However, looking into the output from the C
compiler carefully, in most cases the humble “integer” data type is actually 16‐bits. So when you
have a for loop with an integer as loop index, comparing a value to an integer value, or using a C
library function that uses an integer (e.g. memcpy()), you are actually using 16‐bit or larger data.
This can affect code size and performance in various ways:
• For each integer computation, an 8‐bit processor will need multiple instructions to carry out
the operations. This directly increases the code size and the clock cycle count.
• If the integer value has to be saved into memory, or if you need to load an immediate value
from program ROM to this integer, it will take multiple instructions and multiple clock cycles.
• Since an integer can take up two 8‐bit registers, more registers are required to hold the
same number of integer variables. When there are an insufficient number of registers in the
register bank to hold local variables, some have to be stored in memory. Thus an 8‐bit
microcontroller might result in more memory accesses which increases code size and
reduces performance and power efficiency. The same issue applies to the processing of 32‐
bit data on 16‐bit microcontrollers.
• Since more registers are required to hold an integer in an 8‐bit microcontroller when passing
variables to a function via the stack, or saving register contents during context switching or
interrupt servicing, the number of stack operations required is more than that of 32‐bit
microcontrollers. This increases the program size, and can also affect interrupt latency
because an Interrupt Service Routine (ISR) must make sure that all registers used are saved
at ISR entry and restored at ISR exit. The same issue applies to the processing of 32‐bit data
on 16‐bit microcontrollers.
There is even more bad news for 8‐bit microcontroller users: memory address pointers take multiple
bytes, so data processing involving the use of pointers can be extremely inefficient.
Myth #3: A 32‐bit processor is not efficient at handling 8‐bit and 16‐bit data
Most 32‐bit processors are actually very efficient at handling 8‐bit and 16‐bit data. Compact
memory access instructions for signed and unsigned 8‐bit, 16‐bit and 32‐bit data are all available.
There are also a number of instructions specially included for data type conversions. Overall the
handling of 8‐bit and 16‐bit data in 32‐bit processors such as the ARM Cortex microcontrollers is just
as easy and efficient as handling 32‐bit data.
Myth #4: C libraries for ARM processors are too big
There are various C library options for ARM processors. For microcontroller applications, a number
of compiler vendors have developed C libraries with a much smaller footprint. For example, the
ARM development tools have a smaller version of the C library called MicroLib. These C libraries are
especially designed for microcontrollers and allow application code size to be small and efficient.
Myth #5: Interrupt handling on ARM microcontrollers is more complex
On the ARM Cortex microcontrollers the interrupt service routines are just normal C subroutines.
Vectored or nested interrupts are supported by the Nested Vectored Interrupt Controller (NVIC)
with no need for software intervention. In fact the setup process and processing of an interrupt
request is much simpler than 8‐bit and 16‐bit microcontrollers, as generally you only need to
program the priority level of an interrupt and then enable it.
The interrupt vectors are stored in a vector table in the beginning of the memory, normally within
the flash, without the need for any software programming steps. When an interrupt request takes
place the processor automatically fetches the corresponding interrupt vector and starts to execute
the ISR. Some of the registers are pushed to the stack by a hardware sequence and restored
automatically when the interrupt handler exits. The other registers that are not covered by the
hardware stacking sequence are pushed onto the stack by C compiler‐generated code only if the
register is used and modified within the ISR.
What about moving to 16‐bit microcontrollers?
16‐bit microcontrollers can be efficient in handling 16‐bit integers and 8‐bit data (e.g. strings);
however, the code size is still not as optimal as when using 32‐bit processors:
‐ Handling of 32‐bit data: if the application requires handling of any long integer (32‐bit) or
floating point types then the efficiency of 16‐bit processors is greatly reduced because
multiple instructions are required for each processing operation, as well as data transfers
between the processor and the memory.
‐ Register usage: When processing 32‐bit data, 16‐bit processors require two registers to
hold each 32‐bit variable. This reduces the number of variables that can be held in the
register bank, hence reducing processing speed as well as increasing stack operations and
memory accesses.
‐ Memory addressing mode: Many 16‐bit architectures provide only basic addressing modes
similar to 8‐bit architectures. As a result, the code density is poor when they are used in
applications that require processing of complex data sets.
‐ 64K bytes limitation: Many 16‐bit processors are limited to 64K bytes of addressable
memory reducing the functionality of the application. Some 16‐bit architectures have
extensions to allow more than 64K bytes of memory to be accessed, however, these
extensions have an instruction code and clock cycle overhead, for example, a memory
pointer would be larger than 16‐bits and might require multiple instructions and multiple
registers to process it.
Instruction Set efficiency
When customers port their applications from an 8‐bit architecture to ARM Cortex microcontrollers,
they very often find that the total code size has dramatically decreased. For example, when Melfas (a
leading company in capacitive sensing touch screen controllers) evaluated the Cortex‐M0 processor,
they found that the Cortex‐M0 program size was less than half of that of the 8051 and, at the same
time, delivered five times more performance at the same clock frequency. This could, for example,
enable them to run the application at 1/5 of the clock speed of the equivalent 8051 product, reducing
the power consumption and lowering product cost at the same time thanks to a smaller program flash
size requirement.
So how does ARM architecture provide such big advantages? The key factor is Thumb‐2 technology
which provides a highly efficient unified instruction set.
Powerful Addressing mode
The ARM Cortex microcontrollers support a number of addressing modes for memory transfer
instructions. For example:
‐ Immediate offset (Address = Register value + offset)
‐ Register offset (Address = Register value 1 + shifted(Register value 2))
‐ PC related (Address = Current PC value + offset)
‐ Stack pointer related (Address = SP + offset)
‐ Multiple register load and store, with optional automatic base address update
‐ PUSH/POP instructions with multiple registers
As a result of these various addressing modes, data transfer between registers and memory can be
handled with fewer instructions. Since the PUSH and POP instructions support multiple registers, in
most cases, saving and restoring of registers in a function call will only need one PUSH in the
beginning of function and one POP at the end of the function. The POP can even be combined with
the return instruction at the end of function to further reduce the instruction count.
Conditional branches
Almost all processors provide conditional branch instructions; however, ARM processors provide
improved conditional branching by having separate branch conditions for signed and unsigned data
operation results, and by providing a good branch range.
For example, when comparing the conditional branches of the Cortex‐M0 and MSP430, the
Cortex‐M0 has more branch conditions available, making it possible to generate more compact code no
matter whether the data being processed is signed or unsigned. The MSP430 conditional branches
might require multiple instructions to perform the same operations.
Generally the same situation applies to many 8‐bit or 16‐bit microcontrollers ‐ when dealing with
signed data, additional steps might also be required in the conditional branch.
In addition to the branch instructions available in the Cortex‐M0, the Cortex‐M3 processor also
supports compare‐and‐branch instructions (CBZ and CBNZ). This further simplifies some
conditional branch instruction sequences.
Conditional Execution
Another area that allows the ARM Cortex‐M3 microcontrollers to have more compact code is the
conditional execution feature. The Cortex‐M3 supports an instruction called IT (IF‐THEN). This
instruction allows up to 4 subsequent instructions to be conditionally executed reducing the need
for additional branches. For example,
if (xpos1 < xpos2) {
    x1 = xpos1;
    x2 = xpos2;
} else {
    x1 = xpos2;
    x2 = xpos1;
}
This can be converted to the following assembly code (needs 12 bytes in the Cortex‐M3):
CMP R0, R1
ITTEE CC ; if unsigned “<”
MOVCC R2, R0
MOVCC R3, R1
MOVCS R3, R0
MOVCS R2, R1
Other architectures might need an additional branch (e.g. needs 14 bytes in MSP430):
CMP.W R14, R13
JGE Label1 ; if unsigned “<”
MOV.W R11, R14
MOV.W R12, R13
JMP Label2
Label1
MOV.W R11, R13
MOV.W R12, R14
Label2
This results in an extra two bytes for the MSP430 when compared to Cortex‐M3.
Multiply and Divide
Both the Cortex‐M0 and Cortex‐M3 processors support single cycle multiply operations. The Cortex‐
M3 also has multiply and multiply‐accumulate instructions for 32‐bit or 64‐bit results. These
instructions greatly reduce the code size required when handling multiplication of large variables.
Most other 8‐bit and 16‐bit microcontrollers also have multiply instructions; however, the limitation
of the register size often means that the multiplication requires multiple steps if the result needs to
be more than 8 or 16 bits.
The MSP430 does not have a multiply instruction (MSP430 document slaa329, reference 1). To carry
out multiplication either a memory mapped hardware multiplier is used, or the multiply operation
has to be handled by software using add and shift. Even if a hardware multiplier is present the
memory mapped nature of the multiplier results in the additional overhead of transferring data to
and from the external hardware. In addition, using the multiplier within an interrupt handler could
cause existing data in the multiplier to be lost. As a result, interrupts are usually disabled before a
multiply operation and the interrupt is re‐enabled after multiplication is completed. This adds
additional software overhead and affects interrupt latency and determinism.
The Cortex‐M3 processor also has unsigned and signed integer divide instructions. This reduces the
code size required in applications that need to perform integer division because there is no need for
the C library to include a function for handling divide operations.
Powerful instruction set
In addition to the standard data processing, memory access and program control instructions, the
Cortex microcontrollers also support a number of other instructions to help data type conversion.
The Cortex‐M3 processor also supports a number of bit field operations reducing the software
overhead in, for example, peripheral control and communication data processing.
Breaking the 64K byte memory barrier
As already mentioned, many 8‐bit and 16‐bit microcontrollers are limited to 64k bytes addressable
memory. Due to the nature of 8‐bit and 16‐bit microcontroller architecture, the coding efficiency of
these microcontrollers often decreases dramatically when the application exceeds the 64k byte
memory barrier. In 8‐bit and 16‐bit microcontrollers (e.g. 8051, PIC24, C166) this is often handled by
memory bank switching or memory segmentation with the switching code generated automatically
by the C compilers. Every time a function or data in a different memory page is required, bank
switching code is needed, which further increases the program size.
Figure 2: Increased code size overhead of memory bank switching or segmentation in 8‐bit and 16‐bit
systems
The memory bank switching not only creates larger code but also greatly reduces the performance
of a system. This is especially the case if the data being processed is in a different memory bank (e.g.
copying a block of data from one page to another page can be very costly in terms of performance).
This is particularly inefficient for 8‐bit microcontrollers like the 8051 because the MCS‐51
architecture does not have proper support for such a memory bank switching feature. Therefore
memory switching has to be carried out by saving and updating memory bank controls such as
I/O port registers. In addition, the memory page switching code usually has to be carried out
in a congested shared memory space with limited size. At the same time some of the
memory pages might not be fully utilized and memory space is wasted.
For the 8‐bit and 16‐bit microcontrollers that support memory of over 64k this often comes at a
price. The MSP430X design overcomes the 64K bytes memory barrier by increasing the Program
Counter (PC) and register width to 20‐bits. Despite no memory paging being involved, the sizes of
some MSP430X instructions are considerably larger than the original MSP430. For example, when
the large memory model is used, a double operand formatted instruction can take 8 bytes rather
than 6 (a 33% increase):
[Figure: instruction formats. MSP430 double operand instruction: one 16‐bit instruction word (Op‐code in bits 15:12, Rsrc in bits 11:8, Ad in bit 7, B/W in bit 6, As in bits 5:4, Rdst in bits 3:0), followed by a "Source or destination 15:0" word and a "Destination 15:0" word. MSP430X double operand instruction: the same three words preceded by an extension word (00011, Source 19:16, A/L, Rsrv, Destination 19:16).]
Figure 3: Support of larger memory system increases the size of some instructions in MSP430X
Apart from the size of the instruction itself, the use of the 20‐bit addressing also increases the
number of stack operations required. Since the memory is only 16‐bit, the saving of a 20‐bit address
pointer will need two stack push operations, resulting in extra instructions and poor utilization of the
stack memory.
Figure 4: Use of large memory data model in MSP430X increases code size
As a result, an MSP430X application has a lower code density when the large memory model is used,
which is required when the address range exceeds the 64k range.
In ARM Cortex microcontrollers, 32‐bit linear addressing is used to provide 4GB of memory space for
embedded applications. Therefore there is no paging overhead and the programming model is easy
to use.
Examples
To demonstrate the code size compared to 8‐bit and 16‐bit processors, a number of test cases are
compiled and illustrated here. The tests are based on the “MSP430 Competitive Benchmark” document
from Texas Instruments (SLAA205C, reference 2). The results listed here show total program
memory size in bytes.
MSP430 results:
The tests listed are compiled using IAR Embedded Workbench 4.20.1 with hardware
multiplier enabled, optimization level set to “High” with “Size” optimization. Unless specified,
the “Small” data model is used and type double is 32‐bit. The results are obtained at linker
output report (CODE+CONST).
ARM Cortex processor results:
The tests listed are compiled using RealView Development Suite 4.0‐SP2. Optimization level
is 3 for size, minimal vector table, and MicroLIB is used. The results are obtained at linker
output report (VECTORS + CODE).
Test                 Generic   MSP430F5438   MSP430F5438        Cortex‐M3
                     MSP430                  large data model
Math8bit             198       198           202                144
Math16bit            144       144           144                144
Math32bit            256       244           256                120
MathFloat            1122      1122          1162               600
Matrix2dim8bit       180       178           196                184
Matrix2dim16bit      268       246           290                256
Matrixmult           276       228           (linker error)     228
Switch8bit           200       218           218                160
Switch16bit          198       218           218                160
Firfilter (Note 1)   1202      1170          1222               716 (820 without modification)
Dhry                 923       893           1079               900
Whet (Note 2)        6434      6308          6614               4384 (8496 without modification)
Note 1: The constant data array in the Firfilter test is modified to use 16‐bit data type on the Cortex‐
M processor (const unsigned short int INPUT[]).
Note 2: When certain math functions are used (sin, cos, atan, sqrt, exp, log), the ARM C standard
library uses the double precision versions by default. This can result in significantly larger program size
unless adjustments are made. In order to achieve an equivalent comparison, the program code is
edited so that single precision versions are used (sinf, cosf, atanf, sqrtf, expf, logf). Also, some of
the constant definitions have been adjusted to single precision (e.g. 1.0 becomes 1.0F).
Figure 5 : Code size comparison for basic operations
The total sizes for the simple tests (integer math, matrix and switch tests) are:
Summary for simple tests   Generic MSP430   MSP430F5438   Cortex‐M3
Total size (bytes)         1720             1674          1396
Advantage (% smaller)      ‐                2.6%          18.8%
For applications using floating point, there is a significant advantage for Cortex microcontrollers,
whereas the Dhrystone program size is closer.
Figure 6: Code size comparison for floating point operations and benchmark suites
The total sizes for the benchmark and floating point tests (Dhrystone, Whetstone, Firfilter and
MathFloat) are:
Summary for benchmark and
floating point tests       Generic MSP430   MSP430F5438   Cortex‐M3
Total size (bytes)         9681             9493          6600
Advantage (% smaller)      ‐                1.9%          31.8%
Observations:
1. From the results, we can see that the Cortex microcontrollers have better code density
compared to MSP430 in most cases. The remaining tests show similar code density when
compared to MSP430.
2. One of the tests (firfilter) uses an integer data type for a constant array. Since an integer is
32‐bit in the ARM processor and is 16‐bit on MSP430, the program has been modified to
allow a direct comparison.
3. When the large data memory model is used with MSP430, the code size increases by up to
20% (dhrystone).
4. We are unable to reproduce all of the claimed results in the Texas Instruments document.
This may be because the storage of constant data in ROM might have been omitted from
their code size calculations.
Additional investigation on floating point
When analysing the results of the whetstone benchmark it became apparent that the MSP430 C
compiler only generated single precision floating point operations, while the ARM C compiler generated
double precision operations for some of the math functions used.
After changing the code to use only single precision floating point, the code size reduced
dramatically and resulted in a much smaller code size than that of the MSP430.
The IAR MSP430 compiler has an option that defines the floating point size, “Size of type double”,
which is by default set to 32‐bit (single precision). If it is set to 64‐bit (as in the ARM C compiler),
the code size increases significantly.
Program size              Generic MSP430   MSP430F5438
Type double is 32‐bit     6434             6308
Type double is 64‐bit     11510            11798
These results match those seen for the ARM Cortex‐M3 processor.
Program size                                                                 Cortex‐M3
Whetstone modified to use single precision only                              4384
Out of box compile for whetstone (use double precision for math functions)   8496
The option of setting type double to 32‐bit is quite sensible for small microcontroller applications
where the C code might only need to process source data generated from a 12‐bit or 14‐bit ADC.
Benchmarking with different default types can make a very big difference and does not give accurate
comparative results.
Recommendations on how to get the smallest code size with Cortex‐M microcontrollers
Use MicroLib
In the ARM development tools there is an option to use the area optimized MicroLIB rather than the
standard C libraries. The MicroLIB is suitable for most embedded applications and has a much
smaller code size when compared to the standard C library.
Ensure the use of area optimizations
The performance of Cortex‐M microcontrollers is much higher than that of 16‐bit and 8‐bit
microcontrollers so when porting applications from these microcontrollers you can generally select
the highest area optimization rather than selecting optimizations for speed. The resulting
performance will still be much higher than that of a 16‐bit or 8‐bit system running at the same clock
frequency.
Use the right data type
When porting applications from 8‐bit or 16‐bit microcontrollers, you might need to modify the data
type for constant arrays to achieve the most optimal program size. For example, an integer is
normally 16‐bit in 8‐bit and 16‐bit microcontrollers, while in ARM microcontrollers integers are 32‐
bit.
Type                         Number of bits   Number of bits   Number of bits
                             in 8051          in MSP430        in ARM
“char”, “unsigned char”      8                8                8
“enum”                       8/16             16               8/16/32 (smallest is chosen)
“short”, “unsigned short”    16               16               16
“int”, “unsigned int”        16               16               32
“long”, “unsigned long”      32               32               32
float                        32               32               32
double                       32               32               64
When porting a constant array of integers from an 8‐bit or 16‐bit architecture, you should modify
the data type from “int” to “short int” to make sure the constant array remains the same size. For
example,
const int mydata[] = { 1234, 5678, …};
This should be changed to:
const short int mydata[] = { 1234, 5678, …};
For an array of integer variables (non‐constant data), changing from an integer to a short integer
might also prevent an increase in memory usage during software porting. Most other data (e.g.
variables) does not require modification.
Floating point functions
Some floating point functions are defined as single precision in 8‐bit or 16‐bit microcontrollers and
are by default defined as double precision in ARM microcontrollers, as we have found out with the
whetstone test analysis. When porting application code from 8‐bit or 16‐bit microcontrollers to an
ARM microcontroller, you might have to adjust math functions to single precision versions and
modify constant definitions to ensure that the program behaves in the same way. For example, in
the whetstone program code, a section of code uses some math functions that are double precision
in ARM compilers:
X=T*atan(T2*sin(X)*cos(X)/(cos(X+Y)+cos(X-Y)-1.0));
Y=T*atan(T2*sin(Y)*cos(Y)/(cos(X+Y)+cos(X-Y)-1.0));
If we want to use single precision only, the program code has to be changed to
X=T*atanf(T2*sinf(X)*cosf(X)/(cosf(X+Y)+cosf(X-Y)-1.0F));
Y=T*atanf(T2*sinf(Y)*cosf(Y)/(cosf(X+Y)+cosf(X-Y)-1.0F));
Other constant definitions such as:
/* Module 7: Procedure calls */
X = 1.0;
Y = 1.0;
Z = 1.0;
should be changed to the following for single precision representation:
/* Module 7: Procedure calls */
X = 1.0F;
Y = 1.0F;
Z = 1.0F;
Define peripherals as data structure
You can also reduce program size by defining registers in peripherals as a data structure. For
example, instead of representing the SysTick timer registers as
#define SYSTICK_CTRL (*((volatile unsigned long *)(0xE000E010)))
#define SYSTICK_LOAD (*((volatile unsigned long *)(0xE000E014)))
#define SYSTICK_VAL (*((volatile unsigned long *)(0xE000E018)))
#define SYSTICK_CALIB (*((volatile unsigned long *)(0xE000E01C)))
you can define the SysTick registers as:
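typedef struct
{
    volatile unsigned int CTRL;   /* offset 0x00 (0xE000E010): control and status */
    volatile unsigned int LOAD;   /* offset 0x04 (0xE000E014): reload value */
    volatile unsigned int VAL;    /* offset 0x08 (0xE000E018): current value */
    unsigned int CALIB;           /* offset 0x0C (0xE000E01C): calibration value */
} SysTick_Type;
#define SysTick ((SysTick_Type *) 0xE000E010)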
By doing this, you only need one address constant to be stored in the program ROM. The register
accesses will be using this address constant with different address offsets for different registers. If a
sequence of hardware register accesses is required for a peripheral, using a data structure can
reduce code size as well as improve performance. Most 8-bit microcontrollers lack an equivalent
addressing mode, which can result in a much larger code size for the same task.
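A minimal sketch of the access pattern described above. The helper systick_start and its ctrl_bits parameter are illustrative assumptions, not part of the article. On hardware the pointer argument would be SysTick; on a host, an ordinary variable can stand in for the peripheral.

```c
#include <stdint.h>

/* Same register layout as the SysTick_Type structure above. */
typedef struct
{
    volatile uint32_t CTRL;   /* offset 0x0 */
    volatile uint32_t LOAD;   /* offset 0x4 */
    volatile uint32_t VAL;    /* offset 0x8 */
    uint32_t CALIB;           /* offset 0xC */
} SysTick_Type;

/* Hypothetical helper: three register writes that all go through one base
 * pointer, so the compiler can use base + offset addressing for each store
 * instead of loading a separate address constant per register. */
void systick_start(SysTick_Type *st, uint32_t reload, uint32_t ctrl_bits)
{
    st->LOAD = reload;     /* store to base + 0x4 */
    st->VAL  = 0;          /* store to base + 0x8 */
    st->CTRL = ctrl_bits;  /* store to base + 0x0 */
}
```

On a target, the call would look like systick_start(SysTick, 999, ctrl_bits), with ctrl_bits taken from the device's reference manual.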
Conclusions
32-bit processors provide equal or, more often, better code size than 8-bit and 16-bit architectures
while delivering much better performance.
For users of 8-bit microcontrollers, moving to a 16-bit architecture can solve some of the inherent
problems with 8-bit architectures; however, the overall benefit of migrating from 8-bit to 16-bit is
much smaller than that achieved by migrating to the 32-bit Cortex processors.
As the power consumption and cost of 32-bit microcontrollers have fallen dramatically over the last
few years, 32-bit processors have become the best choice for many embedded projects.
Reference
The following articles on MSP430 are referenced:
1 MSP430 Competitive Benchmarking
http://focus.ti.com/lit/an/slaa205c/slaa205c.pdf
2 Efficient Multiplication and Division Using MSP430
http://focus.ti.com/lit/an/slaa329/slaa329.pdf

Definition of a Microcontroller


The ARM architecture is a 32-bit Reduced Instruction Set Computer (RISC) processor
architecture developed by ARM Limited[1]. It began as a desktop processor, a market
now dominated by the x86 family. However, its simple design makes the ARM processor
well suited to low-power applications. ARM processors are used across consumer
electronics, including PDAs, mobile phones, media players, music players, handheld
game consoles, calculators, and computer peripherals such as hard disk drives and
routers.

Because it is a RISC architecture, ARM shares features found in most other RISC
architectures, such as:
• A large register file
• A load/store architecture, in which data-processing operations work only on
register contents, not directly on memory contents.
• Simple addressing modes, in which every load/store address is determined from
register contents and instruction fields only.
The ARM architecture also has additional features such as:
• Instructions that combine arithmetic and logical operations
• Auto-increment and auto-decrement addressing modes to optimize program loops
• Load and store multiple instructions to maximize data throughput
• Conditional execution of almost all instructions to maximize execution
throughput

ARM7 LPC2368
The LPC2368 is a microcontroller from the ARM7TDMI processor family, designed
for real-time embedded applications. The ARM7TDMI processor has Thumb support
and an enhanced multiplier. The processor architecture delivers up to 130 MIPS
on a standard 0.13 µm process. It implements the V4T architecture and supports
32-bit and 16-bit instructions through the ARM and Thumb instruction sets. It
can be used in industrial control, automotive, and other applications that
require high performance and low power consumption from a 32-bit
microcontroller.


The microcontroller can run at up to 72 MHz from flash or RAM and has 512 KB of
on-chip flash program memory, along with a variety of communication peripherals
including Ethernet, USB, and CAN. This microcontroller family also offers LCD
control (QVGA graphic or segment driver), an SD/MMC interface, an external
memory interface, and an I2S audio interface.
2.1.2 Using the Pin Connect Block on the LPC2368
The pin connect block configures the microcontroller's pins so that each pin can
have more than one function. Configuration registers control the multiplexers
that connect the pins to the on-chip peripherals.
A peripheral must be connected to the appropriate pins before it is activated.
Activity of a peripheral function that is not mapped to the pin in question is
considered undefined. Selecting one function on a port pin disables the other
functions available on the same pin.
Using GPIO on the LPC2368
GPIO PORT0 and PORT1 can be accessed through both register groups, one of which
provides enhanced features and faster port access. PORT2/3/4 can be used only as
fast ports. The accelerated GPIO (Fast I/O) functions are as follows:
• The GPIO registers are relocated to the ARM local bus so that the fastest
possible I/O timing can be achieved.
• All GPIO registers are byte and half-word addressable
• An entire port value can be written in one instruction
• Direction control of individual bits
• After reset, all I/O pins are set as inputs.
When PORT0 and PORT1 are used, you must decide whether the port is accessed
through the enhanced-feature registers or through the standard register set.
Although the enhanced-feature and standard GPIO registers control the same pins,
the two port-control branches are mutually exclusive and operate independently.
For example, changing a pin's output through the fast registers cannot be
observed through the standard registers.
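The listed behaviours (inputs after reset, per-bit direction control, whole-port writes) can be sketched with a small host-side model. The FastPort structure and its field and function names are hypothetical and are not the LPC2368 register map; consult the device user manual for the real register names and addresses.

```c
#include <stdint.h>

/* Hypothetical model of one 32-bit fast GPIO port. Field names are
 * illustrative and are not the LPC2368 register map. */
typedef struct
{
    volatile uint32_t dir;  /* 1 = output, 0 = input, one bit per pin */
    volatile uint32_t pin;  /* value of all 32 pins */
} FastPort;

/* After reset every pin is an input, so the direction word is all zeros. */
void port_reset(FastPort *p)
{
    p->dir = 0;
    p->pin = 0;
}

/* Direction is controlled per bit: make a single pin an output. */
void port_make_output(FastPort *p, unsigned bit)
{
    p->dir |= (uint32_t)1 << bit;
}

/* The whole port value is written with a single 32-bit assignment.
 * This simple model does not mask the value by pin direction. */
void port_write(FastPort *p, uint32_t value)
{
    p->pin = value;
}
```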
2.1.4 Using the C Compiler
In essence, C programming on ARM is standard C programming, both in the
functions used and in the libraries available.
Microcontrollers
2.2.1 Definition of a Microcontroller
A microcontroller is a processor with a specialized function, chiefly control.
Although it is physically very small, its basic elements are the same as those
of a computer. Like a computer, a microcontroller is a device that carries out
the instructions given to it. The most important part of any computerized
system is therefore the program written by the programmer. The program
instructs the computer to run a sequence of simple tasks in order to carry out
the more complex operation the programmer intends.
Some features commonly found on microcontrollers are:



PowerPC Instruction Set


3.3.1.1 PowerPC Instruction Set
The PowerPC instructions are divided into the following categories:
• Integer instructions—These include computational and logical instructions.
— Integer arithmetic instructions
— Integer compare instructions
— Integer logical instructions
— Integer rotate and shift instructions
• Floating-point instructions—These include floating-point computational instructions, as well as
instructions that affect the FPSCR.

— Floating-point arithmetic instructions
— Floating-point multiply/add instructions
— Floating-point rounding and conversion instructions
— Floating-point compare instructions
— Floating-point status and control instructions
• Load/store instructions—These include integer and floating-point load and store instructions.
— Integer load and store instructions
— Integer load and store multiple instructions
— Floating-point load and store
— Primitives used to construct atomic memory operations (lwarx and stwcx. instructions)
• Flow control instructions—These include branching instructions, condition register logical
instructions, trap instructions, and other instructions that affect the instruction flow.
— Branch and trap instructions
— Condition register logical instructions
• Processor control instructions—These instructions are used for synchronizing memory accesses
and management of caches, TLBs, and the segment registers.
— Move to/from SPR instructions
— Move to/from MSR
— Synchronize
— Instruction synchronize
• Memory control instructions—These instructions provide control of caches, TLBs, and segment
registers.
— Supervisor-level cache management instructions
— User-level cache instructions
— Segment register manipulation instructions
— Translation lookaside buffer management instructions

mpc456


Note that this grouping of the instructions does not indicate which execution unit executes a particular
instruction or group of instructions.
Integer instructions operate on byte, half-word, and word operands. Floating-point instructions operate on
single-precision (one word) and double-precision (one double word) floating-point operands. The PowerPC
architecture uses instructions that are four bytes long and word-aligned. It provides for byte, half-word, and
word operand loads and stores between memory and a set of 32 GPRs. It also provides for word and double-word
operand loads and stores between memory and a set of 32 floating-point registers (FPRs).
Computational instructions do not modify memory. To use a memory operand in a computation and then
modify the same or another memory location, the memory contents must be loaded into a register, modified,
and then written back to the target location with distinct instructions.
PowerPC processors follow the program flow when they are in the normal execution state. However, the
flow of instructions can be interrupted directly by the execution of an instruction or by an asynchronous
event. Either kind of exception may cause one of several components of the system software to be invoked.
3.3.1.2 Calculating Effective Addresses
The effective address (EA) is the 32-bit address computed by the processor when executing a memory
access or branch instruction or when fetching the next sequential instruction.
The PowerPC architecture supports two simple memory addressing modes:
• EA = (rA|0) + offset (including offset = 0) (register indirect with immediate index)
• EA = (rA|0) + rB (register indirect with index)
These simple addressing modes allow efficient address generation for memory accesses. Calculation of the
effective address for aligned transfers occurs in a single clock cycle.
For a memory access instruction, if the sum of the effective address and the operand length exceeds the
maximum effective address, the memory operand is considered to wrap around from the maximum effective
address to effective address 0.
Effective address computations for both data and instruction accesses use 32-bit unsigned binary arithmetic.
A carry from bit 0 is ignored in 32-bit implementations.
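The two addressing modes and the wrap-around rule can be sketched in C as follows. The function names are illustrative, and the 16-bit sign-extended offset is a detail of the standard instruction encoding rather than something stated in this section. Unsigned 32-bit arithmetic in C discards the carry out of the top bit, which matches the behaviour described above.

```c
#include <stdint.h>

/* (rA|0): register field 0 means the literal value 0, not the contents of r0. */
static uint32_t base_or_zero(const uint32_t gpr[32], unsigned rA)
{
    return (rA == 0) ? 0u : gpr[rA];
}

/* Register indirect with immediate index: EA = (rA|0) + offset.
 * The 16-bit offset is sign-extended (assumption: standard encoding). */
uint32_t ea_immediate(const uint32_t gpr[32], unsigned rA, int16_t offset)
{
    return base_or_zero(gpr, rA) + (uint32_t)(int32_t)offset;
}

/* Register indirect with index: EA = (rA|0) + rB. */
uint32_t ea_indexed(const uint32_t gpr[32], unsigned rA, unsigned rB)
{
    return base_or_zero(gpr, rA) + gpr[rB];
}
```

For example, with rA holding 0xFFFFFFF0 and an offset of 0x20, the sum overflows 32 bits and the effective address wraps to 0x00000010.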
3.3.2 PowerPC 603 Microprocessor Instruction Set
The 603 instruction set is defined as follows:
• The 603 provides hardware support for all 32-bit PowerPC instructions.
• The 603 provides two implementation-specific instructions used for software table search
operations following TLB misses:
– Load Data TLB Entry (tlbld)
– Load Instruction TLB Entry (tlbli)
• The 603 implements the following instructions which are defined as optional by the PowerPC
architecture:
– External Control In Word Indexed (eciwx)
– External Control Out Word Indexed (ecowx)
– Floating Select (fsel)
– Floating Reciprocal Estimate Single-Precision (fres)
– Floating Reciprocal Square Root Estimate (frsqrte)
– Store Floating-Point as Integer Word (stfiwx)
18 PowerPC 603 RISC Microprocessor Technical Summary
3.4 Cache Implementation
The following subsections describe the PowerPC architecture’s treatment of cache in general, and the 603-
specific implementation, respectively.
3.4.1 PowerPC Cache Characteristics
The PowerPC architecture does not define hardware aspects of cache implementations. For example, some
PowerPC processors, including the 603, have separate instruction and data caches (Harvard architecture),
while others, such as the PowerPC 601™ microprocessor, implement a unified cache.
PowerPC microprocessors control the following memory access modes on a page or block basis:
• Write-back/write-through mode
• Cache-inhibited mode
• Memory coherency
Note that in the 603, a cache line is defined as eight words. The VEA defines cache management instructions
that provide a means by which the application programmer can affect the cache contents.
3.4.2 PowerPC 603 Microprocessor Cache Implementation
The 603 has two 8-Kbyte, two-way set-associative (instruction and data) caches. The caches are physically
addressed, and the data cache can operate in either write-back or write-through mode as specified by the
PowerPC architecture.
The data cache is configured as 128 sets of 2 lines each. Each line consists of 32 bytes, two state bits, and
an address tag. The two state bits implement the three-state MEI (modified/exclusive/invalid) protocol. Each
line contains eight 32-bit words. Note that the PowerPC architecture defines the term block as the cacheable
unit. For the 603, the block size is equivalent to a cache line. A block diagram of the data cache organization
is shown in Figure 3.
The instruction cache also consists of 128 sets of 2 lines, and each line consists of 32 bytes, an address tag,
and a valid bit. The instruction cache may not be written to except through a line fill operation. The
instruction cache is not snooped, and cache coherency must be maintained by software. A fast hardware
invalidation capability is provided to support cache maintenance. The organization of the instruction cache
is very similar to the data cache shown in Figure 3.
Each cache line contains eight contiguous words from memory that are loaded from an 8-word boundary
(that is, bits A27–A31 of the effective addresses are zero); thus, a cache line never crosses a page boundary.
Misaligned accesses across a page boundary can incur a performance penalty.
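The geometry above (8 Kbytes, two ways, 32-byte lines, 128 sets) implies a 5-bit byte offset, a 7-bit set index, and a 20-bit tag. The sketch below shows that split under the assumption of the conventional low-bits-offset, middle-bits-index layout, which the text does not spell out; the type and function names are illustrative.

```c
#include <stdint.h>

enum {
    LINE_BYTES  = 32,   /* eight 32-bit words per line */
    NUM_SETS    = 128,  /* 128 sets of 2 lines = 8 Kbytes */
    OFFSET_BITS = 5,    /* log2(LINE_BYTES) */
    INDEX_BITS  = 7     /* log2(NUM_SETS) */
};

typedef struct {
    uint32_t tag;     /* upper 20 bits, compared against the tags of both ways */
    uint32_t set;     /* selects one of the 128 sets */
    uint32_t offset;  /* byte within the 32-byte line */
} CacheAddr;

/* Split a 32-bit address into tag, set index, and byte offset. */
CacheAddr split_address(uint32_t addr)
{
    CacheAddr a;
    a.offset = addr & (LINE_BYTES - 1);
    a.set    = (addr >> OFFSET_BITS) & (NUM_SETS - 1);
    a.tag    = addr >> (OFFSET_BITS + INDEX_BITS);
    return a;
}

/* A line starts on an 8-word boundary when its low five bits are zero. */
int is_line_aligned(uint32_t addr)
{
    return (addr & (LINE_BYTES - 1)) == 0;
}
```

Because the offset and index together occupy 12 bits, the set index lies entirely within a 4-Kbyte page offset.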
The 603’s cache lines are loaded in four beats of 64 bits each. The burst load is performed as “critical double
word first.” The cache that is being loaded is blocked to internal accesses until the load completes. The
critical double word is simultaneously written to the cache and forwarded to the requesting unit, thus
minimizing stalls due to load delays.
To ensure coherency among caches in a multiprocessor (or multiple caching-device) implementation, the
603 implements the MEI protocol. These three states, modified, exclusive, and invalid, indicate the state of
the cache block as follows:
• Modified—The cache line is modified with respect to system memory; that is, data for this address
is valid only in the cache and not in system memory.
• Exclusive—This cache line holds valid data that is identical to the data at this address in system
memory. No other cache has this data.
• Invalid—This cache line does not hold valid data.
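The three states can be captured in a small C sketch. The predicate names are illustrative, and the store-hit transition is an assumption about write-back operation rather than something this section states.

```c
/* The three MEI states described above. */
typedef enum { MEI_INVALID, MEI_EXCLUSIVE, MEI_MODIFIED } MeiState;

/* A line holds usable data in every state except Invalid. */
int line_is_valid(MeiState s)
{
    return s != MEI_INVALID;
}

/* Only a Modified line differs from system memory, so only it must be
 * copied back before the line is replaced or invalidated. */
int needs_copyback(MeiState s)
{
    return s == MEI_MODIFIED;
}

/* Assumed transition: a store that hits a valid line leaves it Modified. */
MeiState after_store_hit(MeiState s)
{
    return line_is_valid(s) ? MEI_MODIFIED : MEI_INVALID;
}
```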

Part 1 PowerPC 603 Microprocessor Overview


Part 1 PowerPC 603 Microprocessor Overview
This section describes the features of the 603, provides a block diagram showing the major functional units,
and gives an overview of how the 603 operates.
The 603 is the first low-power implementation of the PowerPC microprocessor family of reduced instruction
set computer (RISC) microprocessors. The 603 implements the 32-bit portion of the PowerPC architecture,
which provides 32-bit effective addresses, integer data types of 8, 16, and 32 bits, and floating-point data
types of 32 and 64 bits. For 64-bit PowerPC microprocessors, the PowerPC architecture provides 64-bit
integer data types, 64-bit addressing, and other features required to complete the 64-bit architecture.
The 603 provides four software controllable power-saving modes. Three of the modes (the nap, doze, and
sleep modes) are static in nature, and progressively reduce the amount of power dissipated by the processor.
The fourth is a dynamic power management mode that causes the functional units in the 603 to
automatically enter a low-power mode when the functional units are idle without affecting operational
performance, software execution, or any external hardware.
The 603 is a superscalar processor capable of issuing and retiring as many as three instructions per clock.
Instructions can execute out of order for increased performance; however, the 603 makes completion appear
sequential.
The 603 integrates five execution units—an integer unit (IU), a floating-point unit (FPU), a branch
processing unit (BPU), a load/store unit (LSU), and a system register unit (SRU). The ability to execute five
instructions in parallel and the use of simple instructions with rapid execution times yield high efficiency
and throughput for 603-based systems. Most integer instructions execute in one clock cycle. The FPU is
pipelined so a single-precision multiply-add instruction can be issued every clock cycle.
The 603 provides independent on-chip, 8-Kbyte, two-way set-associative, physically addressed caches for
instructions and data and on-chip instruction and data memory management units (MMUs). The MMUs
contain 64-entry, two-way set-associative, data and instruction translation lookaside buffers (DTLB and
ITLB) that provide support for demand-paged virtual memory address translation and variable-sized block
translation. The TLBs and caches use a least recently used (LRU) replacement algorithm. The 603 also
supports block address translation through the use of two independent instruction and data block address
translation (IBAT and DBAT) arrays of four entries each. Effective addresses are compared simultaneously
with all four entries in the BAT array during block translation. In accordance with the PowerPC architecture,
if an effective address hits in both the TLB and BAT array, the BAT translation takes priority.
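The BAT-over-TLB priority rule can be sketched as below. The BatEntry and TlbResult structures are simplified, hypothetical stand-ins rather than the real register formats, and the loop models sequentially what the hardware compares in parallel.

```c
#include <stdint.h>
#include <stdbool.h>

/* Simplified, hypothetical BAT entry: a block is described by its effective
 * base address, its length in bytes, and its physical base address. */
typedef struct {
    bool     valid;
    uint32_t ea_base;
    uint32_t length;
    uint32_t pa_base;
} BatEntry;

/* Hypothetical stand-in for the outcome of a TLB lookup. */
typedef struct {
    bool     hit;
    uint32_t pa;
} TlbResult;

/* Returns true and stores the physical address if a translation exists.
 * All four BAT entries are checked; a BAT hit takes priority over the TLB. */
bool translate(const BatEntry bat[4], TlbResult tlb, uint32_t ea, uint32_t *pa)
{
    for (int i = 0; i < 4; i++) {
        /* Unsigned subtraction wraps, so addresses below ea_base fail too. */
        if (bat[i].valid && (ea - bat[i].ea_base) < bat[i].length) {
            *pa = bat[i].pa_base + (ea - bat[i].ea_base);
            return true;   /* BAT wins even if the TLB also hit */
        }
    }
    if (tlb.hit) {
        *pa = tlb.pa;
        return true;
    }
    return false;          /* neither structure holds a translation */
}
```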
The 603 has a selectable 32- or 64-bit data bus and a 32-bit address bus. The 603 interface protocol allows
multiple masters to compete for system resources through a central external arbiter. The 603 provides a
three-state coherency protocol that supports the exclusive, modified, and invalid cache states. This protocol
is a compatible subset of the MESI (modified/exclusive/shared/invalid) four-state protocol and operates
coherently in systems that contain four-state caches. The 603 supports single-beat and burst data transfers
for memory accesses; it also supports both memory-mapped I/O and direct-store interface addressing.
The 603 uses an advanced, 3.3-V CMOS process technology and maintains full interface compatibility with
TTL devices.
1.1 PowerPC 603 Microprocessor Features
This section describes details of the 603’s implementation of the PowerPC architecture. Major features of
the 603 are as follows:
• High-performance, superscalar microprocessor
— As many as three instructions issued and retired per clock
— As many as five instructions in execution per clock
— Single-cycle execution for most instructions
— Pipelined FPU for all single-precision and most double-precision operations
• Five independent execution units and two register files
— BPU featuring static branch prediction
— A 32-bit IU
— Fully IEEE 754-compliant FPU for both single- and double-precision operations
— LSU for data transfer between data cache and GPRs and FPRs
— SRU that executes condition register (CR) and special-purpose register (SPR) instructions
— Thirty-two GPRs for integer operands
— Thirty-two FPRs for single- or double-precision operands
• High instruction and data throughput
— Zero-cycle branch capability (branch folding)
— Programmable static branch prediction on unresolved conditional branches
— Instruction fetch unit capable of fetching two instructions per clock from the instruction cache
— A six-entry instruction queue that provides look-ahead capability
— Independent pipelines with feed-forwarding that reduces data dependencies in hardware
— 8-Kbyte data cache—two-way set-associative, physically addressed; LRU replacement
algorithm
— 8-Kbyte instruction cache—two-way set-associative, physically addressed; LRU replacement
algorithm
— Cache write-back or write-through operation programmable on a per page or per block basis
— BPU that performs CR look-ahead operations
— Address translation facilities for 4-Kbyte page size, variable block size, and 256-Mbyte
segment size
— A 64-entry, two-way set-associative ITLB
— A 64-entry, two-way set-associative DTLB
— Four-entry data and instruction BAT arrays providing 128-Kbyte to 256-Mbyte blocks
— Software table search operations and updates supported through fast trap mechanism
— 52-bit virtual address; 32-bit physical address
• Facilities for enhanced system performance
— A 32- or 64-bit split-transaction external data bus with burst transfers
— Support for one-level address pipelining and out-of-order bus transactions
— Bus extensions for direct-store interface operations
• Integrated power management
— Low-power 3.3-volt design
— Internal processor/bus clock multiplier that provides 1/1, 2/1, 3/1, and 4/1 ratios
— Three power saving modes: doze, nap, and sleep
— Automatic dynamic power reduction when internal functional units are idle
• In-system testability and debugging features through JTAG boundary-scan capability
1.2 Block Diagram
Figure 1 provides a block diagram of the 603 that illustrates how the execution units—IU, FPU, BPU, LSU,
and SRU—operate independently and in parallel.
The 603 provides address translation and protection facilities, including an ITLB, DTLB, and instruction
and data BAT arrays. Instruction fetching and issuing are handled in the instruction unit. Translation of
addresses for cache or external memory accesses is handled by the MMUs. Both units are discussed in
more detail in Sections 1.3, “Instruction Unit,” and 1.5.1, “Memory Management Units (MMUs).”
1.3 Instruction Unit
As shown in Figure 1, the 603 instruction unit, which contains a fetch unit, instruction queue, dispatch unit,
and BPU, provides centralized control of instruction flow to the execution units. The instruction unit
determines the address of the next instruction to be fetched based on information from the sequential fetcher
and from the BPU.
The instruction unit fetches the instructions from the instruction cache into the instruction queue. The BPU
extracts branch instructions from the fetcher and uses static branch prediction on unresolved conditional
branches to allow the instruction unit to fetch instructions from a predicted target instruction stream while
a conditional branch is evaluated. The BPU folds out branch instructions for unconditional branches or
conditional branches unaffected by instructions in progress in the execution pipeline.
Instructions issued beyond a predicted branch do not complete execution until the branch is resolved,
preserving the programming model of sequential execution. If any of these instructions are to be executed
in the BPU, they are decoded but not issued. Instructions to be executed by the FPU, IU, LSU, and SRU are
issued and allowed to complete up to the register write-back stage. Write-back is allowed when a correctly
predicted branch is resolved, and instruction execution continues without interruption along the predicted
path.
If branch prediction is incorrect, the instruction unit flushes all predicted path instructions, and instructions
are issued from the correct path.
Figure 1. PowerPC 603 Microprocessor Block Diagram
[Block diagram not reproduced. Blocks shown: instruction unit (sequential fetcher, instruction queue,
dispatch unit, and branch processing unit with CTR, CR, and LR); integer unit (with XER); floating-point
unit (with FPSCR); load/store unit; system register unit; completion unit; GPR file and FPR file with their
rename registers; I MMU and D MMU (SRs, ITLB/DTLB, IBAT/DBAT arrays); 8-Kbyte I cache and D cache with
tags; processor bus interface (touch load buffer, copyback buffer) to the 32-bit address bus and 32-/64-bit
data bus; power dissipation control; time base counter/decrementer; clock multiplier; JTAG/COP interface.]
1.3.1 Instruction Queue and Dispatch Unit
The instruction queue (IQ), shown in Figure 1, holds as many as six instructions and loads up to two
instructions from the instruction unit during a single cycle. The instruction fetch unit continuously loads as
many instructions as space in the IQ allows. Instructions are dispatched to their respective execution units
from the dispatch unit at a maximum rate of two instructions per cycle. Dispatching is facilitated to the IU,
FPU, LSU, and SRU by the provision of a reservation station at each unit. The dispatch unit performs source
and destination register dependency checking, determines dispatch serializations, and inhibits subsequent
instruction dispatching as required.
For a more detailed overview of instruction dispatch, see Section 3.7, “Instruction Timing.”
1.3.2 Branch Processing Unit (BPU)
The BPU receives branch instructions from the fetch unit and performs CR look-ahead operations on
conditional branches to resolve them early, achieving the effect of a zero-cycle branch in many cases.
The BPU uses a bit in the instruction encoding to predict the direction of the conditional branch. Therefore,
when an unresolved conditional branch instruction is encountered, the 603 fetches instructions from the
predicted target stream until the conditional branch is resolved.
The BPU contains an adder to compute branch target addresses and three user-control registers—the link
register (LR), the count register (CTR), and the CR. The BPU calculates the return pointer for subroutine
calls and saves it into the LR for certain types of branch instructions. The LR also contains the branch target
address for the Branch Conditional to Link Register (bclrx) instruction. The CTR contains the branch target
address for the Branch Conditional to Count Register (bcctrx) instruction. The contents of the LR and CTR
can be copied to or from any GPR. Because the BPU uses dedicated registers rather than GPRs or FPRs,
execution of branch instructions is largely independent from execution of integer and floating-point
instructions.
1.4 Independent Execution Units
The PowerPC architecture’s support for independent execution units allows implementation of processors
with out-of-order instruction execution. For example, because branch instructions do not depend on GPRs
or FPRs, branches can often be resolved early, eliminating stalls caused by taken branches.
In addition to the BPU, the 603 provides four other execution units and a completion unit, which are
described in the following sections.
1.4.1 Integer Unit (IU)
The IU executes all integer instructions. The IU executes one integer instruction at a time, performing
computations with its arithmetic logic unit (ALU), multiplier, divider, and integer exception register (XER).
Most integer instructions are single-cycle instructions. Thirty-two general-purpose registers are provided to
support integer operations. Stalls due to contention for GPRs are minimized by the automatic allocation of
rename registers. The 603 writes the contents of the rename registers to the appropriate GPR when integer
instructions are retired by the completion unit.
1.4.2 Floating-Point Unit (FPU)
The FPU contains a single-precision multiply-add array and the floating-point status and control register
(FPSCR). The multiply-add array allows the 603 to efficiently implement multiply and multiply-add
operations. The FPU is pipelined so that single-precision instructions and double-precision instructions can
be issued back-to-back. Thirty-two floating-point registers are provided to support floating-point operations.
Stalls due to contention for FPRs are minimized by the automatic allocation of rename registers. The 603
writes the contents of the rename registers to the appropriate FPR when floating-point instructions are
retired by the completion unit.
The 603 supports all IEEE 754 floating-point data types (normalized, denormalized, NaN, zero, and infinity)
in hardware, eliminating the latency incurred by software exception routines. (An ‘exception’ is also
called an ‘interrupt’ in the architecture specification.)
1.4.3 Load/Store Unit (LSU)
The LSU executes all load and store instructions and provides the data transfer interface between the GPRs,
FPRs, and the cache/memory subsystem. The LSU calculates effective addresses, performs data alignment,
and provides sequencing for load/store string and multiple instructions.
Load and store instructions are issued and translated in program order; however, the actual memory accesses
can occur out of order. Synchronizing instructions are provided to enforce strict ordering.
Cacheable loads, when free of data dependencies, execute in a speculative manner with a maximum
throughput of one per cycle and a two-cycle total latency. Data returned from the cache is held in a rename
register until the completion logic commits the value to a GPR or FPR. Stores cannot be executed
speculatively and are held in the store queue until the completion logic signals that the store operation is to
be completed to memory. The time required to perform the actual load or store operation varies depending
on whether the operation involves the cache, system memory, or an I/O device.
1.4.4 System Register Unit (SRU)
The SRU executes various system-level instructions, including condition register logical operations and
move to/from special-purpose register instructions. In order to maintain system state, most instructions
executed by the SRU are completion-serialized; that is, the instruction is held for execution in the SRU until
all prior instructions issued have completed. Results from completion-serialized instructions executed by
the SRU are not available or forwarded for subsequent instructions until the instruction completes.
1.4.5 Completion Unit
The completion unit tracks instructions from dispatch through execution, and then retires, or “completes”
them in program order. Completing an instruction commits the 603 to any architectural register changes
caused by that instruction. In-order completion ensures the correct architectural state when the 603 must
recover from a mispredicted branch or any exception.
Instruction state and other information required for completion is kept in a first-in-first-out (FIFO) queue of
five completion buffers. A single completion buffer is allocated for each instruction once it enters the
dispatch unit. An available completion buffer is a required resource for instruction dispatch; if no
completion buffers are available, instruction dispatch stalls. A maximum of two instructions per cycle are
completed in order from the queue.
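The completion rules above (five buffers, dispatch stalls when none is free, at most two in-order retirements per cycle) can be modelled with a small ring buffer. This is a behavioural sketch only; the structure and function names are hypothetical and say nothing about how the 603 implements the queue in hardware.

```c
#include <stdbool.h>

#define COMPLETION_BUFFERS   5  /* FIFO depth given in the text */
#define MAX_RETIRE_PER_CYCLE 2  /* completion rate given in the text */

typedef struct {
    bool finished[COMPLETION_BUFFERS]; /* has the instruction executed? */
    int  head;                         /* oldest instruction in the queue */
    int  count;                        /* buffers currently allocated */
} CompletionQueue;

/* Dispatch needs a free buffer. Returns the slot index, or -1 to stall. */
int cq_dispatch(CompletionQueue *q)
{
    if (q->count == COMPLETION_BUFFERS)
        return -1;
    int slot = (q->head + q->count) % COMPLETION_BUFFERS;
    q->finished[slot] = false;
    q->count++;
    return slot;
}

/* An execution unit reports that the instruction in this slot is done. */
void cq_mark_finished(CompletionQueue *q, int slot)
{
    q->finished[slot] = true;
}

/* Retire up to two instructions per cycle, oldest first. An unfinished
 * instruction at the head blocks younger ones, which keeps completion in
 * program order. Returns the number retired this cycle. */
int cq_complete_cycle(CompletionQueue *q)
{
    int retired = 0;
    while (retired < MAX_RETIRE_PER_CYCLE && q->count > 0 &&
           q->finished[q->head]) {
        q->head = (q->head + 1) % COMPLETION_BUFFERS;
        q->count--;
        retired++;
    }
    return retired;
}
```

In this model an instruction that finishes execution early still waits in its buffer until every older instruction has retired, which is how out-of-order execution is made to appear sequential.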
1.5 Memory Subsystem Support
The 603 provides support for cache and memory management through dual instruction and data memory
management units. The 603 also provides dual 8-Kbyte instruction and data caches, and an efficient
processor bus interface to facilitate access to main memory and other bus subsystems. The memory
subsystem support functions are described in the following subsections.
1.5.1 Memory Management Units (MMUs)
The 603’s MMUs support up to 4 Petabytes (2^52) of virtual memory and 4 Gigabytes (2^32) of physical
memory (referred to as real memory in the architecture specification) for instruction and data. The MMUs
also control access privileges for these spaces on block and page granularities. Referenced and changed