Needing to retrieve files from a server without the usual tools like FTP or SCP is a common challenge for system administrators and developers. Imagine a scenario where you have a legacy Ubuntu Server 14.04 machine, and you need to grab a directory of files to update your web application. You’re restricted from installing new software, so there are no quick fixes like setting up an SSH server or using scp. What options do you have?
This article explores a practical and efficient way to overcome this hurdle: start a simple Python web server on the machine, then copy the files with a command-line tool such as `wget` or `curl`. This approach is particularly useful when you’re in a pinch and need a fast, software-agnostic way to extract files from a server.
Setting Up a Simple Python Web Server
Python, often pre-installed on many Linux distributions including Ubuntu 14.04, comes with a built-in module to serve files over HTTP. This eliminates the need for installing and configuring complex web servers like Apache or Nginx. Here’s how you can quickly set up a Python web server from your Ubuntu server’s command line:
- **Navigate to the directory:** Use the `cd` command to move into the directory you wish to copy. For example, if you want to copy the directory `/var/www/html`, use: `cd /var/www/html`
- **Start the Python SimpleHTTPServer:** Python 2 and Python 3 use slightly different commands.
  - For Python 2: `python -m SimpleHTTPServer 8000`
  - For Python 3: `python3 -m http.server 8000`

  In these commands, `8000` is the port number the web server will use. You can choose any port number that is not already in use.
Once executed, this command starts a basic HTTP server serving files from the current directory. It will typically display a message indicating the server is running, often showing the port number.
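If you would rather drive the built-in module from a script than from the one-liner, the same server can be started programmatically. This is a Python 3 sketch; the directory path is only an example, and the `serve_directory` name is my own:

```python
# A Python 3 sketch equivalent to `python3 -m http.server 8000`, handy if
# you want the server inside a script you can tweak. The directory path
# below is an example; point it at the directory you want to serve.
import os
from http.server import HTTPServer, SimpleHTTPRequestHandler

def serve_directory(directory, port=8000):
    """Serve `directory` over HTTP on `port` until interrupted with Ctrl+C."""
    os.chdir(directory)  # SimpleHTTPRequestHandler serves the working directory
    httpd = HTTPServer(("", port), SimpleHTTPRequestHandler)
    print("Serving %s on port %d" % (directory, port))
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        pass
    finally:
        httpd.server_close()

# serve_directory("/var/www/html", 8000)  # blocks until Ctrl+C
```

Passing port `0` lets the OS pick a free port, which is useful when you are not sure which ports are already taken.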
Copying Files Using `wget` or `curl`
With the Python web server running on your Ubuntu server, you can now use another machine on the same network to copy the files. The `wget` and `curl` command-line tools are excellent for downloading files from web servers and are usually available on most operating systems.
- **Identify the server’s IP address:** You’ll need to know the IP address of your Ubuntu server. You can find it by running `ifconfig` or `ip addr` on the Ubuntu server itself.
- **Use `wget` or `curl` to copy:** On your local machine (the machine you want to copy the files to), open a terminal or command prompt.
  - **Using `wget` to recursively download the directory:**

    `wget -r -np -nH --cut-dirs=1 http://<server-ip>:8000/`

    - `-r`: recursive download; wget descends into directories and subdirectories.
    - `-np`: no parent; prevents wget from climbing up to the parent directory.
    - `-nH`: no host directory; files are saved in the current directory rather than in a directory named after the server.
    - `--cut-dirs=1`: removes the first directory component from saved paths; useful if you started the server in the root of the directory you want.
    - `http://<server-ip>:8000/`: the URL of your Python web server; replace `<server-ip>` with the actual IP address of your Ubuntu server.
  - **Using `curl` to list directory contents and then download (more manual):** `curl` is generally used for single-file downloads, but it can fetch the directory listing, which you can then script against to download files. For a simple directory listing:

    `curl http://<server-ip>:8000/`

    This displays the HTML directory listing in your terminal. For actual file downloads with `curl`, you typically need to know the specific file names and download them individually, or write a script that parses the directory listing.
- **Wait for the download to complete:** Depending on the size of the directory and your network speed, the download may take some time. `wget` shows its progress in the terminal.
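The scripted approach mentioned for `curl` above can also be done in pure Python with the standard library: fetch the directory listing, extract the links, and download each file. This is a minimal sketch; `LinkCollector` and `download_listing` are names of my own, and for brevity it skips subdirectories (links ending in `/`) rather than recursing into them:

```python
# Sketch: download every file linked from a SimpleHTTPServer index page.
# Subdirectories (hrefs ending in "/") are skipped to keep the example short.
from html.parser import HTMLParser
from urllib.parse import urljoin, unquote
import urllib.request

class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in the listing page."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

def download_listing(base_url):
    """Fetch the directory listing at base_url and download each plain file."""
    parser = LinkCollector()
    parser.feed(urllib.request.urlopen(base_url).read().decode("utf-8"))
    saved = []
    for href in parser.links:
        if href.endswith("/"):  # subdirectory link; skipped in this sketch
            continue
        name = unquote(href)
        urllib.request.urlretrieve(urljoin(base_url, href), name)
        saved.append(name)
    return saved

# download_listing("http://<server-ip>:8000/")
```

Files land in the current working directory of the machine running the script, similar to `wget` with `-nH`.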
Security Considerations
While the Python SimpleHTTPServer is incredibly convenient, it’s crucial to remember it is not designed for production environments or secure data transfer. It lacks security features and should only be used for temporary, non-sensitive file retrieval on a trusted network.
- Exposure: The files are exposed over HTTP, meaning they are not encrypted during transfer.
- No Authentication: Anyone who can access the server’s IP and port can access the files being served.
- Temporary Use: Always shut down the Python web server once you have completed your file transfer to minimize any potential security risks. You can stop the server by pressing `Ctrl+C` in the terminal where it is running on the Ubuntu server.
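One way to enforce the temporary-use advice, in case you forget about an open terminal, is to have Python shut the server down automatically after a fixed window. This is a sketch; `serve_temporarily` is a name of my own and the timeout value is arbitrary:

```python
# Sketch: serve the current directory, then shut down automatically after
# `timeout` seconds so a forgotten server doesn't stay exposed.
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler

def serve_temporarily(port=8000, timeout=600):
    """Serve the current directory on `port`, stopping after `timeout` seconds."""
    httpd = HTTPServer(("", port), SimpleHTTPRequestHandler)
    threading.Timer(timeout, httpd.shutdown).start()  # fires from another thread
    httpd.serve_forever()  # returns once shutdown() is called by the timer
    httpd.server_close()

# serve_temporarily(8000, timeout=600)  # stop after ten minutes
```

You can still stop it earlier with `Ctrl+C`; the timer is just a safety net.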
Conclusion
Using a Python web server and then copying files with `wget` or `curl` offers a quick, efficient, and software-independent method to retrieve files from a server when you are restricted from installing new software or using standard file transfer protocols. This technique is especially valuable in emergencies or when dealing with legacy systems where conventional methods are not readily available. While it’s essential to be mindful of the security implications and use it judiciously, it remains a powerful tool in a system administrator’s toolkit for rapid file retrieval.