Introduction to Address Resolution Protocol (ARP)
IP datagrams contain IP addresses, but the physical network interface on a host or router only understands the link-layer addressing scheme of that network (for example, an Ethernet MAC address). Therefore, before sending a datagram on a local network, the sender must determine the corresponding link-layer (physical) address for a given IP address. The protocol that provides this mapping dynamically is the Address Resolution Protocol (ARP).
Why address resolution is needed
Network layer addresses (IPv4 or IPv6) and link layer addresses (Ethernet MAC, Token Ring, etc.) serve different purposes and have different formats and sizes. On a single broadcast or multiaccess link, hosts and routers deliver frames using link-layer addresses. To transmit a packet to an IP destination that is on the same physical network, a sender must therefore know the target's link-layer address.
One simple but impractical approach is to encode the physical address into the host portion of the IP address. For example, a host whose physical address is 00100001 01010001 (the upper byte has decimal value 33 and the lower byte 81) might be given the IP address 128.96.33.81. This technique is limited by the size of the host field in the IP address (for instance, a class C network offers only 8 host bits) and is not compatible with common 48-bit Ethernet MAC addresses. A general, flexible solution is to keep IP and link addresses separate and map between them dynamically using ARP.
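The arithmetic behind this encoding can be sketched in a few lines of Python. This is only an illustration: the function name host_bits_to_ip and the 16-bit host field (network prefix 128.96) are assumptions chosen to match the example above, not part of any real addressing scheme.

```python
def host_bits_to_ip(network_prefix: str, physical_address: int) -> str:
    """Embed a 16-bit physical address in the host portion of an IP address."""
    if not 0 <= physical_address <= 0xFFFF:
        raise ValueError("physical address does not fit in 16 host bits")
    upper = (physical_address >> 8) & 0xFF   # high-order byte
    lower = physical_address & 0xFF          # low-order byte
    return f"{network_prefix}.{upper}.{lower}"


# Upper byte 33, lower byte 81, as in the example above.
ip = host_bits_to_ip("128.96", 0b00100001_01010001)
print(ip)  # 128.96.33.81

# A 48-bit Ethernet address cannot be embedded this way.
try:
    host_bits_to_ip("128.96", 0xAABBCCDDEEFF)
except ValueError as err:
    print("rejected:", err)
```

The second call shows why the scheme does not generalise: the address simply has more bits than the host field.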
Mapping options and the role of ARP
Mapping between IP addresses and link-layer addresses may be accomplished in several ways:
- Static configuration: the mapping table is configured manually by an administrator and distributed to hosts.
- Centralised service: a server maintains mappings and replies to queries from hosts.
- Dynamic, distributed learning: each host learns mappings by exchanging messages on the local network.
ARP implements the dynamic, distributed approach. Its goal is to enable each host to build a local table of mappings between IP addresses and link-layer addresses (commonly called the ARP cache or ARP table) and to keep those mappings reasonably current.
An ARP packet carries addresses and a few control fields so that a sender and target can agree on the mapping. The same ARP structure is used for different link layers and different higher-level protocols; the differences are captured by length/type fields. An ARP packet contains the following key fields:
- Hardware Type - specifies the type of the physical network (for example, Ethernet is type 1).
- Protocol Type - specifies the higher-level protocol for which the mapping is requested (for example, IPv4 is 0x0800).
- HLen (hardware address length) - the length in bytes of the link-layer address (for Ethernet this is 6).
- PLen (protocol address length) - the length in bytes of the protocol address (for IPv4 this is 4).
- Operation - indicates the ARP message type (commonly a request or a reply).
- Sender hardware address - the link-layer address of the host that sent the ARP message.
- Sender protocol address - the IP address of the host that sent the ARP message.
- Target hardware address - the link-layer address of the target (in an ARP request this is typically zero or unknown).
- Target protocol address - the IP address whose link-layer address is being sought.
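As a rough illustration of this layout, the sketch below packs and parses an ARP packet for the Ethernet/IPv4 case (HLen 6, PLen 4) using Python's struct module. The helper names build_arp and parse_arp are hypothetical, and the operation codes 1 (request) and 2 (reply) are taken from RFC 826 rather than from the text above.

```python
import ipaddress
import struct

# Fixed layout for Ethernet (HLen=6) and IPv4 (PLen=4) only; other
# hardware or protocol types would need different address sizes.
ARP_ETH_IPV4 = struct.Struct("!HHBBH6s4s6s4s")

OP_REQUEST = 1  # operation codes as defined in RFC 826
OP_REPLY = 2


def mac_to_bytes(mac: str) -> bytes:
    return bytes.fromhex(mac.replace(":", ""))


def bytes_to_mac(raw: bytes) -> str:
    return ":".join(f"{b:02x}" for b in raw)


def build_arp(operation, sender_mac, sender_ip, target_mac, target_ip) -> bytes:
    """Pack the nine ARP fields in network byte order."""
    return ARP_ETH_IPV4.pack(
        1,        # Hardware Type: Ethernet
        0x0800,   # Protocol Type: IPv4
        6,        # HLen
        4,        # PLen
        operation,
        mac_to_bytes(sender_mac),
        ipaddress.IPv4Address(sender_ip).packed,
        mac_to_bytes(target_mac),
        ipaddress.IPv4Address(target_ip).packed,
    )


def parse_arp(packet: bytes) -> dict:
    """Unpack the first 28 bytes of an Ethernet/IPv4 ARP packet."""
    htype, ptype, hlen, plen, op, sha, spa, tha, tpa = ARP_ETH_IPV4.unpack(
        packet[:ARP_ETH_IPV4.size]
    )
    return {
        "hardware_type": htype,
        "protocol_type": ptype,
        "hlen": hlen,
        "plen": plen,
        "operation": op,
        "sender_mac": bytes_to_mac(sha),
        "sender_ip": str(ipaddress.IPv4Address(spa)),
        "target_mac": bytes_to_mac(tha),
        "target_ip": str(ipaddress.IPv4Address(tpa)),
    }


# A request: the target hardware address is unknown, so it is left as zeros.
request = build_arp(OP_REQUEST, "aa:aa:aa:aa:aa:aa", "10.0.0.1",
                    "00:00:00:00:00:00", "10.0.0.2")
fields = parse_arp(request)
print(len(request), fields["operation"], fields["target_ip"])
```

The addresses used here are placeholders; on a real network the packet would be carried inside a link-layer frame, which this sketch does not build.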
How ARP works (request and reply)
The ARP exchange on an Ethernet-like broadcast network typically proceeds as follows:
- A host that wants to send an IP packet to an IP address that is on the same local network first checks its ARP cache.
- If an entry exists, the host obtains the corresponding link-layer address and transmits the frame to that address.
- If no entry exists, the host broadcasts an ARP request to the local network. The ARP request contains the sender's hardware and protocol addresses and the target protocol address; the target hardware address field is left zero (unknown).
- Every host on the local network receives the broadcast and compares the target protocol address in the ARP request with its own IP address.
- The host whose IP address matches the target address sends an ARP reply directly (unicast) to the requester. The reply carries the responder's hardware and protocol addresses in its sender fields, which gives the requester the mapping it asked for.
- Upon receiving the ARP reply, the original requester records the mapping (IP → link address) in its ARP cache and proceeds to send the IP datagram encapsulated in a link-layer frame to the resolved hardware address.
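The steps above can be mimicked with a small in-memory simulation. This is a sketch under simplifying assumptions: the Host and Lan classes and the resolve function are hypothetical names, the broadcast is modelled as a loop over hosts, and no real frames are sent.

```python
class Host:
    def __init__(self, name, ip, mac):
        self.name = name
        self.ip = ip
        self.mac = mac
        self.arp_cache = {}  # IP address -> link-layer address

    def handle_request(self, target_ip):
        """Reply with (ip, mac) only if the request is for this host's IP."""
        if target_ip == self.ip:
            return (self.ip, self.mac)
        return None


class Lan:
    """Stands in for a broadcast link: every attached host sees each request."""

    def __init__(self, hosts):
        self.hosts = list(hosts)

    def broadcast_request(self, requester, target_ip):
        replies = []
        for host in self.hosts:
            if host is requester:
                continue
            reply = host.handle_request(target_ip)
            if reply is not None:
                replies.append(reply)  # modelled as a unicast back to the requester
        return replies[0] if replies else None


def resolve(lan, requester, target_ip):
    """Return the link-layer address for target_ip, or None if nobody answers."""
    if target_ip in requester.arp_cache:               # cache hit: no traffic needed
        return requester.arp_cache[target_ip]
    reply = lan.broadcast_request(requester, target_ip)  # cache miss: ask everyone
    if reply is None:
        return None
    ip, mac = reply
    requester.arp_cache[ip] = mac                      # record the mapping
    return mac


a = Host("A", "10.0.0.1", "aa:aa:aa:aa:aa:aa")
b = Host("B", "10.0.0.2", "bb:bb:bb:bb:bb:bb")
c = Host("C", "10.0.0.3", "cc:cc:cc:cc:cc:cc")
lan = Lan([a, b, c])

print(resolve(lan, a, "10.0.0.2"))  # bb:bb:bb:bb:bb:bb
print(a.arp_cache)                  # {'10.0.0.2': 'bb:bb:bb:bb:bb:bb'}
```

A second call to resolve for the same address is answered from A's cache without another broadcast.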
ARP cache and timeouts
Entries in the ARP cache are not permanent. Because interfaces may change addresses and devices may join or leave the network, ARP entries are aged and removed after a timeout period unless refreshed. Typical implementations time out dynamic ARP entries on the order of minutes (commonly around 15 minutes), though the exact value depends on the operating system and configuration. Hosts may also maintain static ARP entries that do not time out; these are configured manually for special cases.
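A minimal sketch of such a cache follows, assuming a single fixed timeout and lazy expiry at lookup time. The class name ArpCache and its methods are illustrative only; real operating systems use more elaborate state machines. The clock is injectable so that the ageing behaviour can be demonstrated without waiting.

```python
import time


class ArpCache:
    def __init__(self, timeout_seconds=15 * 60, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        self._entries = {}  # ip -> (mac, expiry time or None for static)

    def add(self, ip, mac):
        """Add or refresh a dynamic entry; refreshing restarts its timer."""
        self._entries[ip] = (mac, self._clock() + self._timeout)

    def add_static(self, ip, mac):
        """Add a manually configured entry that never times out."""
        self._entries[ip] = (mac, None)

    def lookup(self, ip):
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, expires = entry
        if expires is not None and self._clock() >= expires:
            del self._entries[ip]  # aged out; caller must resolve again
            return None
        return mac


# Demonstration with a fake clock (a one-element list holding "now").
now = [0.0]
cache = ArpCache(timeout_seconds=900, clock=lambda: now[0])
cache.add("10.0.0.2", "bb:bb:bb:bb:bb:bb")
cache.add_static("10.0.0.254", "cc:cc:cc:cc:cc:cc")

now[0] = 901.0                      # just past the 15-minute timeout
print(cache.lookup("10.0.0.2"))     # None
print(cache.lookup("10.0.0.254"))   # cc:cc:cc:cc:cc:cc
```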
Generalisations and related protocols
- Variable address sizes: ARP is general enough to support different link layers and different protocol address lengths by using the HLen and PLen fields.
- Proxy ARP: a router can respond to ARP requests on behalf of another host, making that host appear to be on the local network. This can be used to provide connectivity between segments without explicit routing configuration, but it is used only in specific scenarios.
- Gratuitous ARP: a host may send an ARP request or reply for its own IP address to announce or verify an IP→MAC mapping (commonly used to detect IP address conflicts and to update other hosts' ARP caches).
- Reverse ARP (RARP) and successors: RARP was an early protocol used by diskless hosts to discover their IP address given a known link-layer address. RARP has largely been superseded by BOOTP and, later, DHCP.
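To make the gratuitous ARP case concrete, the sketch below shows one way a receiver might recognise such a message and use it to spot an address conflict. It assumes the common convention that a gratuitous ARP carries the same IP address in the sender and target protocol fields; the function names are hypothetical.

```python
def is_gratuitous(sender_ip: str, target_ip: str) -> bool:
    """A gratuitous ARP announces the sender's own address."""
    return sender_ip == target_ip


def indicates_conflict(my_ip: str, my_mac: str,
                       sender_ip: str, sender_mac: str) -> bool:
    """True if another interface claims the IP address configured here."""
    return sender_ip == my_ip and sender_mac != my_mac


# Host 10.0.0.5 hears an announcement for its own address from another MAC.
announcement = {"sender_ip": "10.0.0.5", "sender_mac": "dd:dd:dd:dd:dd:dd",
                "target_ip": "10.0.0.5"}

gratuitous = is_gratuitous(announcement["sender_ip"], announcement["target_ip"])
conflict = indicates_conflict("10.0.0.5", "aa:aa:aa:aa:aa:aa",
                              announcement["sender_ip"],
                              announcement["sender_mac"])
print(gratuitous, conflict)  # True True
```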
Security and operational issues
- ARP spoofing/poisoning: because ARP is a trust-based protocol with no authentication, a malicious host can send forged ARP replies that associate its MAC address with the IP address of another host (for example, a gateway). This can be used for man-in-the-middle attacks or traffic interception.
- Mitigations: use of static ARP entries for critical hosts, network switch features such as port security, DHCP snooping, and dynamic ARP inspection can reduce the risk of ARP-based attacks.
- Performance: ARP generates broadcast traffic for unresolved addresses. Good caching policy and properly sized timeouts balance timeliness and overhead.
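One simple detection heuristic, sketched below, is to remember the first mapping seen for each IP address and flag any later message that claims a different link-layer address. This is an assumption-laden illustration, not a description of how dynamic ARP inspection works on a switch; the ArpWatcher class is hypothetical, and a changed mapping can also be legitimate (for example, after a network card is replaced).

```python
class ArpWatcher:
    def __init__(self):
        self._known = {}  # ip -> first link-layer address observed

    def observe(self, ip, mac):
        """Return None if the mapping is consistent, else (ip, known_mac, new_mac)."""
        known = self._known.get(ip)
        if known is None:
            self._known[ip] = mac   # first sighting: remember it
            return None
        if known != mac:
            return (ip, known, mac)  # possible spoofing; keep the old mapping
        return None


watcher = ArpWatcher()
watcher.observe("10.0.0.1", "aa:aa:aa:aa:aa:aa")        # the gateway's real address
alert = watcher.observe("10.0.0.1", "ee:ee:ee:ee:ee:ee")  # forged reply
print(alert)  # ('10.0.0.1', 'aa:aa:aa:aa:aa:aa', 'ee:ee:ee:ee:ee:ee')
```

Keeping the original mapping on a mismatch is a deliberate choice in this sketch: it behaves like a static entry once learned, at the cost of needing manual intervention when an address legitimately changes.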
Using ARP results in forwarding
When a host learns an IP→link mapping via ARP, it typically adds or updates that information in its local ARP cache. This mapping can be used directly by the host when constructing the link-layer frame for an outgoing packet. In routers and layer-3 switches, the ARP mapping may be shown as an extra column in a forwarding/next-hop table so that the device can encapsulate packets correctly on the outgoing interface.
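The sketch below illustrates how a forwarding lookup and an ARP lookup might combine. The table contents, interface name, and the function next_hop are made up for illustration. The key assumption is that for a directly connected destination the address to resolve is the destination itself, while for any other destination it is the gateway's address.

```python
import ipaddress

# (destination network, gateway IP or None if directly connected, interface)
FORWARDING_TABLE = [
    (ipaddress.ip_network("192.168.1.0/24"), None, "eth0"),
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.1.1", "eth0"),  # default route
]

# Mappings previously learned via ARP (placeholder values).
ARP_CACHE = {
    "192.168.1.1": "aa:bb:cc:00:00:01",
    "192.168.1.20": "aa:bb:cc:00:00:20",
}


def next_hop(destination: str):
    """Return (interface, next-hop IP, next-hop MAC or None), or None if no route."""
    addr = ipaddress.ip_address(destination)
    matches = [entry for entry in FORWARDING_TABLE if addr in entry[0]]
    if not matches:
        return None
    # Longest-prefix match picks the most specific route.
    network, gateway, interface = max(matches, key=lambda e: e[0].prefixlen)
    hop_ip = gateway if gateway is not None else destination
    # A MAC of None means an ARP request would be needed before sending.
    return (interface, hop_ip, ARP_CACHE.get(hop_ip))


print(next_hop("192.168.1.20"))  # ('eth0', '192.168.1.20', 'aa:bb:cc:00:00:20')
print(next_hop("8.8.8.8"))       # ('eth0', '192.168.1.1', 'aa:bb:cc:00:00:01')
```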
Example: ARP exchange on an Ethernet LAN
- A wants to send an IP packet to B, which is on the same LAN. A looks up B's IP in its ARP cache and finds no entry.
- A broadcasts an ARP request: "Who has IP X.X.X.X? Tell A at MAC AA:AA:AA:AA:AA:AA."
- All hosts receive the request; B recognises its IP and unicasts an ARP reply to A: "IP X.X.X.X is at MAC BB:BB:BB:BB:BB:BB."
- A updates its ARP cache with the mapping (X.X.X.X → BB:BB:BB:BB:BB:BB) and sends the original IP packet inside an Ethernet frame addressed to B's MAC.
Summary
ARP is the standard mechanism used on IPv4 networks to map network layer addresses to link-layer addresses dynamically. It is simple, flexible (supports multiple hardware and protocol types via type/length fields), and widely implemented. Correct operation of ARP and sensible cache management are essential for reliable local delivery of IP datagrams. Awareness of ARP's limitations and security issues is important in network design and operations.