One of the basic functions provided by enterprise IT is the hosting of file services in an organization. Since the early days of computer networks, having shared network locations to store and edit documents and other file resources has been a basic requirement.
As demand for file and network shares grew, IT admins in many enterprise environments found themselves juggling numerous file shares, server names, and other network resources just to keep files reachable across the organization. Managing a large number of network shares spread across different servers can become very labor-intensive.
Microsoft introduced a solution that helps organizations manage file shares by logically grouping them into a single hierarchical structure. The technology is known as Distributed File System, or DFS.
In this post, we will take a look at what DFS is, how it works, requirements and considerations, best practices, and finally, how it is configured.
What is a Distributed File System (DFS)?
Distributed File System (DFS), as touched on in the introduction, provides the ability to logically group shares found on multiple servers and to transparently link them into a single hierarchical namespace organized in a treelike structure. DFS supports two namespace modes: stand-alone and domain-based.
A domain-based namespace is required when you want the namespace itself to be highly available. As with other Microsoft technologies that rely on Active Directory replication, the topology data for a domain-based DFS namespace is stored in Active Directory.
DFS uses the DFS Replication (DFSR) service, which superseded the older File Replication Service (FRS), to copy changes between replicated targets. When users modify files stored on one target, DFS Replication propagates the changes to the other designated targets in the DFS infrastructure. If the same file is changed on more than one target, the most recent change is preserved.
DFS abstracts the underlying physical file servers where the actual shares reside from the namespace through which the shares are accessed. With tens or hundreds of file servers and shares, keeping track of individual server and share names can become a management nightmare. The DFS namespace aggregates the shares and hides this underlying complexity from end users.
This benefits not only end users but also the IT admin, who gains flexibility in managing the physical storage behind the DFS-hosted shares. If more storage is needed, the IT admin can add a new storage device and share, copy and synchronize the files from the old device to the new one, and simply retarget the DFS link to point to the new share on the backend.
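The retargeting step described above can also be scripted with the DFSN PowerShell module. The following is a minimal sketch rather than a definitive procedure: the namespace path and the server names FS-OLD and FS-NEW are hypothetical, and it assumes the data has already been copied and synchronized to the new share.

```powershell
# Hypothetical names, for illustration only.
$folder    = '\\contoso.com\Public\Projects'   # DFS folder (link) that users browse to
$oldTarget = '\\FS-OLD\Projects'               # share on the old storage
$newTarget = '\\FS-NEW\Projects'               # share on the new storage

# 1. Add the new share as an additional target of the same DFS folder.
New-DfsnFolderTarget -Path $folder -TargetPath $newTarget

# 2. Take the old target offline so that no new referrals point to it.
Set-DfsnFolderTarget -Path $folder -TargetPath $oldTarget -State Offline

# 3. Once clients have moved over, remove the old target from the folder.
Remove-DfsnFolderTarget -Path $folder -TargetPath $oldTarget -Force
```

Because users keep using the same namespace path throughout, the change of backend server is invisible to them.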
Does DFS only work with Microsoft Windows Servers?
A server that is not running Windows Server cannot host a DFS root or provide referrals to other DFS targets. However, shares hosted on non-Windows systems can be published in a DFS namespace as targets, as long as clients have a redirector for them. This includes any SMB-compatible device, such as network-attached storage (NAS) devices from many different vendors, as well as Samba shares. It does not work with NFS or HDFS.
How are DFS namespaces organized?
This is really up to the needs of the business. Namespaces are commonly organized by business unit, by geographical location, by a combination of both, or by other custom business entities.
What is included with the DFS topology data?
Components of the DFS tree structure include:
- DFS Root – This is a DFS server running the DFS service
- DFS links – DFS links point to network shares consolidated in DFS
- DFS Targets – The targets are the actual network shares the DFS links point to
The components making up a DFS namespace include the following:
- Namespace server – This is the DFS server that hosts a namespace. This can be a member server or a domain controller.
- Namespace root – This is the starting point of the DFS namespace tree. In the case of a domain-based DFS topology, it will start with the domain name. With domain-based DFS topologies, the DFS metadata is stored in Active Directory and replicated between the AD DS domain controllers. You can have multiple namespace servers hosting the DFS namespace.
- Folder – There are two types of folders in the DFS namespace – folders without targets and folders with targets. The folders without targets are simply for the organization of the structure. Folders with targets link to the actual content that end users can access.
- Folder targets – Folder targets are the actual UNC paths to the shared folders associated with a folder in a DFS namespace. The folder target is where the data is actually found. A great benefit of domain-based DFS namespaces is that they are Active Directory site aware. When a folder has targets in several sites, a user is directed to the target server in their own site, which improves the user experience and avoids unnecessary traffic across the WAN.
High-level overview of the DFS namespace components (Image courtesy of Microsoft)
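To see how these components map onto a real namespace, the DFSN PowerShell module can list each of them. This is only a sketch; the namespace path \\contoso.com\Public is a hypothetical example.

```powershell
# Hypothetical domain-based namespace.
$root = '\\contoso.com\Public'

# The namespace root and the namespace servers (root targets) that host it.
Get-DfsnRoot -Path $root
Get-DfsnRootTarget -Path $root

# The folders directly under the root, followed by the folder targets
# (UNC paths of the real shares) behind each folder.
Get-DfsnFolder -Path "$root\*" | ForEach-Object {
    Get-DfsnFolderTarget -Path $_.Path
}
```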
Below are the feature characteristics of the Distributed File System (DFS)
|Windows Server host service
|Storage hardware agnostic
|Windows Server Stretched cluster creation
|Non-Windows File Shares
|TCP/IP or RDMA
|Network constraint support
|iWARP, InfiniBand, RoCEv2
|Replication network port
|TCP 445 or 5445
|Over-the-wire encryption and signing
|Per-volume failovers allowed
|Thin-provisioned storage support
|PowerShell, Failover Cluster Manager
Domain-based DFS Namespaces
There are a couple of use cases for domain-based DFS namespaces.
We have already touched briefly on why a domain-based DFS namespace is beneficial. Choose a domain-based namespace in the following cases:
- You need high availability for the DFS namespace. Hosting the same namespace on multiple DFS namespace servers provides HA.
- You want to hide the underlying DFS server name. A domain-based namespace presents the share in the \\<DomainName>\<Namespace> format. Coupled with access-based enumeration (ABE), users are only presented with the files and folders in a namespace that they have access to.
To use the domain-based DFS namespace topology in Windows Server 2008 mode:
- Make sure your Active Directory forest functional level is Windows Server 2003 or higher
- Make sure the domain functional level is Windows Server 2008 or higher
- Namespace servers must be running Windows Server 2008 or higher
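The functional levels in this list can be read from any machine that has the ActiveDirectory PowerShell module installed (it ships with the RSAT tools). A minimal sketch, assuming the machine is joined to the domain you want to check:

```powershell
# Requires the ActiveDirectory module (part of RSAT).
Import-Module ActiveDirectory

# Forest functional level of the current forest.
(Get-ADForest).ForestMode

# Domain functional level of the current domain.
(Get-ADDomain).DomainMode
```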
How Distributed File System (DFS) Works
A key component of how DFS works is DFS Replication.
DFS Replication in Windows Server is a role service that replicates the folders referred to by a DFS namespace path across multiple servers and sites. It is a multi-master replication technology, meaning any member of a DFS replication group can make changes to the data.
DFS replication makes use of an efficient compression algorithm called Remote Differential Compression (RDC) to detect changes to data. RDC is extremely efficient in that it can detect changes to a file and only copy the changed file blocks instead of recopying the entire file.
A DFS replication group, mentioned above, is a set of servers that participate in the replication of one or more replicated folders. A replicated folder stays synchronized between the members of the replication group. The connections between the group members form the DFS replication topology.
The settings for the replication group including its topology, schedule, and bandwidth throttling are applied to the replicated folders contained in the DFS replication group.
DFS Replication Group (Image courtesy of Microsoft)
Each replicated folder in the DFS replication group has its own settings, including file and subfolder filters, so different files and subfolders can be excluded for each replicated folder. Replicated folders can be located on different volumes on a member and do not need to be shared folders or part of a namespace. DFS Replication can be managed with the DFS Management console, the DfsrAdmin and Dfsrdiag commands, or scripts that call WMI.
An important point about how DFS Replication works is that it replicates a file only after the file is closed. This makes it unsuitable for files that are constantly in use, such as database files or other files that stay open for extended periods. For that kind of data, you may want to look into other technologies such as Storage Replica, which was introduced in Windows Server 2016 and replicates at the block level.
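The per-folder filters can also be set with the DFSR PowerShell module. The sketch below assumes a replication group named RG-Projects with a replicated folder named Projects, both hypothetical, and excludes temporary files plus a subfolder called Scratch.

```powershell
# Hypothetical replication group and replicated folder names.
$group  = 'RG-Projects'
$folder = 'Projects'

# Exclude temporary/backup files and one subfolder from replication.
Set-DfsReplicatedFolder -GroupName $group -FolderName $folder `
    -FileNameToExclude '~*', '*.bak', '*.tmp' `
    -DirectoryNameToExclude 'Scratch'

# Confirm the filters now stored for the replicated folder.
Get-DfsReplicatedFolder -GroupName $group -FolderName $folder |
    Select-Object FolderName, FileNameToExclude, DirectoryNameToExclude
```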
Distributed File System (DFS) Requirements
There are a few requirements and considerations to make note of when thinking of deploying a Distributed File System (DFS).
Servers that are running the following operating systems can host multiple domain-based namespaces in addition to a single stand-alone namespace.
- Windows Server 2019
- Windows Server 2016
- Windows Server 2012 R2
- Windows Server 2012
- Windows Server 2008 R2 Datacenter and Enterprise Editions
- Windows Server (Semi-Annual Channel)
Servers that are running the following operating systems can host a single stand-alone namespace:
- Windows Server 2008 R2 Standard (Windows Server 2008 and 2008 R2 have reached the end of extended support.)
What are the requirements to host a DFS namespace?
|Server Hosting Stand-Alone Namespaces|Server Hosting Domain-Based Namespaces|
|---|---|
|Must contain an NTFS volume to host the namespace.|Must contain an NTFS volume to host the namespace.|
|Can be a member server or a domain controller.|Must be a member server or domain controller in the domain in which the namespace is configured. (This requirement applies to every namespace server that hosts a given domain-based namespace.)|
|Can be hosted by a failover cluster to increase the availability of the namespace.|The namespace cannot be a clustered resource in a failover cluster. However, you can locate the namespace on a server that also functions as a node in a failover cluster if you configure the namespace to use only local resources on that server.|
What are the requirements to deploy DFS replication?
- Update the Active Directory Domain Services (AD DS) schema to include Windows Server 2003 R2 or later schema additions. Read-only replicated folders cannot be used with the Windows Server 2003 R2 or older schema additions.
- All servers in a DFS replication group must be located in the same Active Directory forest. You cannot enable replication across servers in different AD forests.
- Install DFS Replication on all servers that will act as members of a replication group.
- Ensure you have the proper antivirus exclusions in place for DFS Replication, as replication activity can trigger false positives in many antivirus solutions.
- Locate folders that you want to replicate with DFS on NTFS formatted volumes. As of yet, DFS Replication does not support the Resilient File System (ReFS) or the FAT file system. DFS Replication also does not support replicating content stored on Cluster Shared Volumes.
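The NTFS requirement is easy to check before adding a folder to a replication group. The following sketch uses the .NET DriveInfo class; the content path D:\Shares\Projects is a hypothetical example.

```powershell
# Hypothetical local path of a folder you plan to replicate.
$contentPath = 'D:\Shares\Projects'

# Resolve the volume root (for example D:\) and read its file system.
$volumeRoot = [System.IO.Path]::GetPathRoot($contentPath)
$drive      = New-Object System.IO.DriveInfo -ArgumentList $volumeRoot

if ($drive.DriveFormat -eq 'NTFS') {
    Write-Output "$volumeRoot is NTFS and can hold a replicated folder."
} else {
    Write-Output "$volumeRoot is $($drive.DriveFormat); DFS Replication requires NTFS."
}
```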
Considerations in Using Distributed File System (DFS) with Azure
DFS can be used in conjunction with Azure, but there are a few considerations to keep in mind when using the two together. Microsoft's documentation states that DFS has been tested with Azure virtual machines.
Below are a few points to note with DFS and Azure:
- No clustering of stand-alone namespaces in Azure VMs
- Domain-based namespaces can be run on Azure VMs, including Azure environments running Azure Active Directory
- Restoring a server running DFS Replication from a snapshot or saved state causes DFS Replication to fail, which can require special database recovery steps.
- Don’t export, copy, or clone DFS Replication VMs
- Use backup software from within the guest virtual machine
- You will need a site-to-site VPN connection between your on-premises replication group members and the members hosted in Azure VMs, with appropriate firewall port exceptions for communication between them. This includes the RPC Endpoint Mapper (port 135) and a randomly assigned port in the dynamic range (49152 to 65535).
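A quick way to confirm that the RPC Endpoint Mapper is reachable across the VPN is Test-NetConnection. This is a sketch only: the host name AZ-FS01.contoso.com is hypothetical, and the check covers port 135 alone, not the dynamically assigned replication port.

```powershell
# Hypothetical replication member running in an Azure VM.
$azureMember = 'AZ-FS01.contoso.com'

# Test TCP connectivity to the RPC Endpoint Mapper over the VPN.
Test-NetConnection -ComputerName $azureMember -Port 135 |
    Select-Object ComputerName, RemotePort, TcpTestSucceeded
```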
Configuring Distributed File System (DFS)
Let’s take a look at how to configure Distributed File System in Windows Server 2019.
Installing Distributed File System (DFS) on a Windows Server involves adding role services to your servers. The DFS roles are a subcomponent of the File and Storage Services role.
Installing the DFS Namespaces and DFS Replication roles
Below is a look at the available DFS roles when using the Get-WindowsFeature PowerShell cmdlet.
Using PowerShell to view the available DFS roles
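For reference, a command along these lines lists the two DFS role services and shows whether they are installed. The wildcard matches the FS-DFS-Namespace and FS-DFS-Replication feature names used in the install commands in this post.

```powershell
# List the DFS role services and their current install state.
Get-WindowsFeature -Name 'FS-DFS*' |
    Select-Object DisplayName, Name, InstallState
```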
You can install the DFS roles by using either the Server Manager console or using PowerShell.
Installing DFS roles using Server Manager
PowerShell is a great way to install DFS roles quickly and easily. This is also great for the automation and mass installation of roles across many servers.
- Install-WindowsFeature -Name "FS-DFS-Namespace" -IncludeManagementTools
- Install-WindowsFeature -Name "FS-DFS-Replication" -IncludeManagementTools
Using PowerShell to install the DFS roles on a Windows Server
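For mass installation, the same cmdlet can be pointed at remote servers. The sketch below assumes two hypothetical servers, FS01 and FS02, that are reachable for remote management from the machine running the script.

```powershell
# Hypothetical file servers that should receive both DFS role services.
$servers = 'FS01', 'FS02'

foreach ($server in $servers) {
    # Install DFS Namespaces and DFS Replication, plus the management tools.
    Install-WindowsFeature -Name FS-DFS-Namespace, FS-DFS-Replication `
        -ComputerName $server -IncludeManagementTools
}
```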
Creating a DFS Namespace
You can launch the DFS Management Console by using the command dfsmgmt.msc.
As you can see the management console is fairly straightforward. Under the Actions pane, you can click the New Namespace button to begin creating a new DFS namespace.
Launching the DFS console and beginning to create a new DFS namespace
The New Namespace Wizard launches. The first step is entering the name of the server that will host the namespace. You can either enter the server name or click Browse and find the server name.
Enter the Namespace server
Enter the Namespace Name and Settings. As noted, the name you choose is the name that appears after the server or domain name in the namespace path. You can also click the Edit Settings button and get more granular with the local path and settings.
Enter the name for the namespace
The next step is choosing a Namespace Type. This is where you choose between a domain-based and a stand-alone namespace. You can also select Enable Windows Server 2008 mode (enabled by default), which sets up the namespace with increased scalability and support for access-based enumeration (ABE). With ABE, end users only see the files and folders they have access to.
Choose stand-alone namespace, if:
- Your organization does not use Active Directory Domain Services (AD DS).
- You want to use a failover cluster to increase the availability of the namespace.
- You need to create a single namespace with more than 5,000 DFS folders in a domain that does not meet the requirements for a domain-based namespace (Windows Server 2008 mode).
The minimum requirements for Windows Server 2008 mode are as follows:
- The forest uses the Windows Server 2003 or higher forest functional level.
- The domain uses the Windows Server 2008 or higher domain functional level.
- All namespace servers are running Windows Server 2008 or later.
Choose domain-based namespace, if:
- You use multiple namespace servers to ensure the availability of the namespace.
- You want to hide the name of the namespace server from users.
Choosing the Namespace Type
Review the settings and create the namespace.
Reviewing DFS Namespace settings and completing the configuration
The create namespace wizard completes successfully.
Finishing the new namespace wizard
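The wizard steps above have a PowerShell equivalent in the DFSN module. The following is a minimal sketch under stated assumptions: it is run on the namespace server itself, and the server name FS01, the domain contoso.com, the namespace name Public, and the local path C:\DFSRoots\Public are all hypothetical. Unlike the wizard, the cmdlet expects the root share to exist already, so the sketch creates it first.

```powershell
# Hypothetical names, for illustration only.
$rootFolder = 'C:\DFSRoots\Public'        # local folder backing the namespace root
$rootShare  = '\\FS01\Public'             # share on the namespace server
$namespace  = '\\contoso.com\Public'      # domain-based namespace path

# Create the local folder and share that will hold the namespace root.
New-Item -Path $rootFolder -ItemType Directory -Force | Out-Null
New-SmbShare -Name 'Public' -Path $rootFolder

# Create a domain-based namespace in Windows Server 2008 mode (DomainV2)
# with access-based enumeration enabled.
New-DfsnRoot -Path $namespace -TargetPath $rootShare -Type DomainV2 `
    -EnableAccessBasedEnumeration $true

# Review the result.
Get-DfsnRoot -Path $namespace
```

Replacing DomainV2 with Standalone (and using a \\FS01\Public style namespace path) would create a stand-alone namespace instead.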
Adding a DFS Target to a Namespace
Next, we need to add a target to the DFS namespace. Right-click the namespace and choose New Folder.
Adding a new folder to a DFS namespace
Adding a folder to the namespace
You can enter either a UNC path for a remote server share, or you can click the Browse button and select a local server folder.
Adding a folder target to the DFS namespace
Here, we have added a local folder.
Adding a local folder target to a DFS namespace
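The same folder can be added from PowerShell. This sketch assumes the hypothetical namespace \\contoso.com\Public and an existing share \\FS01\Projects that should appear in it as a folder named Projects.

```powershell
# Hypothetical namespace folder and the existing share it should point to.
$dfsFolder = '\\contoso.com\Public\Projects'
$target    = '\\FS01\Projects'

# Create the DFS folder with its first folder target.
New-DfsnFolder -Path $dfsFolder -TargetPath $target

# List the targets behind the new folder to confirm the link.
Get-DfsnFolderTarget -Path $dfsFolder
```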
Creating a New Replication Group
Finally, let’s walk through creating a replication group that will replicate the DFS-hosted folder on the source server to the target server. This will copy the data from the source to the destination.
***Note*** files are not replicated until they are closed.
Creating a New Replication Group
Choose the Replication Group Type.
Choose the ‘Multipurpose replication group’ if you would like to configure custom replication topologies. You first add a set of servers to the replication group and then configure the connections between them to achieve the desired topology.
There are three topology options available for a Multipurpose replication group: Hub and spoke, Full mesh, and No topology.
Hub and spoke
It can be used with three or more servers. Each spoke can use one or more hub members to replicate data. Multiple hubs can be used for redundancy in case any one of them becomes unavailable. Hubs should host the same replication data.
Full mesh
It can be used between two or more servers. In a full mesh topology, data is replicated between all replication members. This topology is recommended when the DFS replication group is composed of fewer than 10 servers.
No topology
You will be able to create the DFS connections once the wizard is completed. No replication will occur until the connections are configured.
Choose the ‘Replication group for data collection’, if you want to add two servers to a replication group in such a way that a hub (destination) server can be configured to collect data from another branch server. You cannot keep on adding new members to a data collection replication group, but you can keep on creating additional replication groups that all have the same hub server, to make up a kind of hub and spoke topology if you had multiple ‘branch offices.’
Multipurpose Replication Group is a more versatile setup and can operate in hub or mesh mode. If you’re not sure what to pick, pick a Multipurpose Replication Group.
Here we are choosing a Multipurpose replication group. This option configures replication between two or more servers for publication content sharing and other scenarios.
Choosing the Replication Group Type
Choose the Name and Domain.
Choosing Name and Domain
Choose the Replication Group Members. Here we have selected the two DFS servers we want to include in the Replication Group.
Choosing the Replication Group Members
Next, you choose the Replication Topology. If you have only two servers, the Hub and Spoke option is greyed out. By default, with two servers, you will see Full Mesh selected.
Choosing the replication topology for DFS replication
With DFS Replication, you can choose the replication schedule and the bandwidth you want to allocate to the DFS replication process.
Replication Group Schedule and Bandwidth configuration
Select the Primary Member for the DFS Replication group.
Selecting the Primary Member for the DFS Replication Group
Choose your folders to replicate to the replication members. Click the Add button to add these folders.
Add Folders to Replicate
Choose the local path on the other replication members where the replicated folder will be stored.
Configuring the local path of replicated folders on other members
Review the Replication Settings and click Create to create the replication group.
Reviewing and creating the replication group
The replication group is created successfully.
Creating the replication group
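The wizard's choices can also be expressed with the DFSR PowerShell module. This is a sketch of a two-member group, not the exact configuration from the screenshots: the group name RG-Projects, the folder name Projects, the servers FS01 and FS02, and the path D:\Shares\Projects are hypothetical.

```powershell
# Hypothetical names, for illustration only.
$group  = 'RG-Projects'
$folder = 'Projects'
$path   = 'D:\Shares\Projects'

# Create the replication group, the replicated folder, and add both members.
New-DfsReplicationGroup -GroupName $group |
    New-DfsReplicatedFolder -FolderName $folder |
    Add-DfsrMember -ComputerName 'FS01', 'FS02'

# Connect the two members (creates replication in both directions by default).
Add-DfsrConnection -GroupName $group `
    -SourceComputerName 'FS01' -DestinationComputerName 'FS02'

# Set the local content path on each member; FS01 is the primary member,
# so its content wins during initial replication.
Set-DfsrMembership -GroupName $group -FolderName $folder -ComputerName 'FS01' `
    -ContentPath $path -PrimaryMember $true -Force
Set-DfsrMembership -GroupName $group -FolderName $folder -ComputerName 'FS02' `
    -ContentPath $path -Force
```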
You will see an informational message about the possible replication delay.
Informational message about DFS replication delay
After only a few moments, the source file is replicated to the DFS replication member. Below, a file was created on the source server and within a few moments, the file had replicated to the DFS replication group member, with the same contents.
Replication target receives the file from the DFS replication process
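Besides comparing files by hand, the replication backlog gives a quick view of what is still waiting to be copied. The sketch assumes a hypothetical group RG-Projects with a replicated folder Projects and the members FS01 and FS02.

```powershell
# List files still waiting to replicate from FS01 to FS02.
# -Verbose also reports the backlog count, or that there is no backlog.
Get-DfsrBacklog -GroupName 'RG-Projects' -FolderName 'Projects' `
    -SourceComputerName 'FS01' -DestinationComputerName 'FS02' -Verbose
```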
DFS Does Not Replace Backups
A common misconception is that DFS can serve as a form of backup, since data can be replicated between a number of members in a DFS replication group. However, keep in mind that while replication does create additional copies of data on additional servers, it does not protect against data loss caused by end-user mistakes or security threats like ransomware.
Because of the replication process, changes made by end users or by ransomware are replicated to the other servers in the DFS replication group. Businesses still need to back up their data and keep multiple restore points that align with business needs to truly protect their data.
Windows Distributed File System (DFS) is a great way to scale many Windows Server file shares across a network. It allows easily aggregating file shares logically and abstracting the namespace from the actual underlying network share name. This can make it much easier for end-users to reach resources and for IT admins to perform maintenance on the underlying storage contributing to the shares.
The process to create a DFS namespace, add target shares/folders, and create replication groups is very straightforward using either GUI or command-line options including PowerShell to control DFS. DFS is not a backup solution for your data. While it can replicate data to multiple servers, this does not protect your business from end-user related data loss or data loss as a result of cybersecurity threats like ransomware.