Introduction


Using «RTF to HTML .Net», any .NET application can easily convert DOCX documents to HTML and XHTML. For example, to convert DOCX to HTML in C#, you only need to add a reference to the .dll file and write a few lines of code:

SautinSoft.RtfToHtml r = new SautinSoft.RtfToHtml();
r.OpenDocx(@"c:\Quiet Flows the Don.docx");

r.OutputFormat = SautinSoft.RtfToHtml.eOutputFormat.HTML_5;
r.ToHtml(@"c:\Quiet Flows the Don.html");

This component provides you with a complete set of APIs for converting DOCX to HTML. In addition, during the conversion to HTML, you can configure the following parameters:

  • Choose the output format: HTML 3.2, HTML 4.01, HTML5, XHTML, Text.
  • Generate the output document in plain HTML 3.2 without CSS.
  • Store images on the filesystem or embed them into the HTML document using base64 encoding.
  • Save CSS data between the <style>...</style> tags or as inline styles in the style="..." attribute.
  • Specify the encoding of the output HTML.
  • Customize the document title; create only the HTML part between the <body>...</body> tags.
  • Customize the overall font, size, and color for the entire document.
  • Detect hyperlinks in the text and convert them to real ones.
  • Override the visibility of table borders.
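To make the hyperlink option more concrete, here is a minimal sketch of what "detecting hyperlinks in the text" can mean in practice. It uses only standard .NET classes and is not the component's own implementation; the class name LinkDemo, the method Linkify, and the URL pattern are all hypothetical and chosen for illustration.

```csharp
using System;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

public static class LinkDemo
{
    // Hypothetical, deliberately simple pattern: http/https URLs up to the
    // next whitespace, angle bracket, or double quote.
    private static readonly Regex UrlPattern =
        new Regex(@"https?://[^\s<>""]+", RegexOptions.IgnoreCase);

    // Wraps every URL found in plain text in an <a> tag and
    // HTML-encodes everything else.
    public static string Linkify(string plainText)
    {
        var sb = new StringBuilder();
        int last = 0;
        foreach (Match m in UrlPattern.Matches(plainText))
        {
            // Encode the text between the previous match and this one.
            sb.Append(WebUtility.HtmlEncode(plainText.Substring(last, m.Index - last)));
            string url = WebUtility.HtmlEncode(m.Value);
            sb.Append($"<a href=\"{url}\">{url}</a>");
            last = m.Index + m.Length;
        }
        // Encode whatever follows the last match.
        sb.Append(WebUtility.HtmlEncode(plainText.Substring(last)));
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Linkify("See https://example.com now"));
    }
}
```

A pattern this simple will also capture trailing punctuation such as a full stop after the URL, so treat it as a starting point only.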

Download


To see this functionality firsthand, download the latest version of «RTF to HTML .Net» with code examples (49.2 MB).

Examples of converting DOCX to HTML in C# and VB.NET

1. Convert DOCX file to HTML file in C#:

SautinSoft.RtfToHtml r = new SautinSoft.RtfToHtml();
string docxFile = @"d:\The Gift.docx";
string htmlFile = Path.ChangeExtension(docxFile, ".html");

r.OpenDocx(docxFile);
r.OutputFormat = SautinSoft.RtfToHtml.eOutputFormat.HTML_5;
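if (r.ToHtml(htmlFile))
{
	// Open HTML in the default browser.
	// UseShellExecute must be set explicitly on .NET Core, where it defaults to false.
	System.Diagnostics.Process.Start(new System.Diagnostics.ProcessStartInfo(htmlFile) { UseShellExecute = true });
}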

2. Convert DOCX to HTML in memory using C#; store images inside the HTML using base64:

SautinSoft.RtfToHtml r = new SautinSoft.RtfToHtml();
string docxFile = @"d:\The Gift.docx";
byte[] docxBytes = File.ReadAllBytes(docxFile);

// Let's store all images inside the HTML document.
r.ImageStyle.IncludeImageInHtml = true;
r.OpenDocx(docxBytes);
string htmlString = r.ToHtml();
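
For readers unfamiliar with base64 embedding, the sketch below shows what an embedded image generally looks like in HTML: the image bytes are encoded as base64 and placed in a data URI inside the src attribute. It uses only standard .NET calls; the class name DataUriDemo and the method ToImgTag are hypothetical, and the exact markup the component emits may differ.

```csharp
using System;

public static class DataUriDemo
{
    // Builds an <img> tag whose src is a base64 data URI
    // instead of a path to a file on disk.
    public static string ToImgTag(byte[] imageBytes, string mimeType)
    {
        string base64 = Convert.ToBase64String(imageBytes);
        return $"<img src=\"data:{mimeType};base64,{base64}\" />";
    }

    public static void Main()
    {
        // The first four bytes of the PNG signature, used as stand-in data.
        byte[] sample = { 0x89, 0x50, 0x4E, 0x47 };

        // Prints: <img src="data:image/png;base64,iVBORw==" />
        Console.WriteLine(ToImgTag(sample, "image/png"));
    }
}
```

The trade-off is that the HTML becomes a single self-contained file, while its size grows by roughly one third compared with the raw image bytes.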

3. Convert DOCX to HTML in VB.NET; make all CSS styles inline, i.e. put them within the style="..." attribute:

Dim r As New SautinSoft.RtfToHtml()

' Set HTML5 format.
r.OutputFormat = SautinSoft.RtfToHtml.eOutputFormat.HTML_5
' Set utf-8 encoding.
r.Encoding = SautinSoft.RtfToHtml.eEncoding.UTF_8
' Make all CSS inline.
r.TextStyle.InlineCSS = True

Dim docxFile As String = "e:\Petersburg.docx"
Dim htmlFile As String = Path.ChangeExtension(docxFile, ".html")
r.ConvertFile(docxFile, htmlFile)

4. Convert DOCX to HTML in C#; get the list with all images from DOCX.

SautinSoft.RtfToHtml r = new SautinSoft.RtfToHtml();
string docxFile = @"d:\The Gift.docx";
byte[] docxBytes = File.ReadAllBytes(docxFile);

// Let's store all images inside the HTML document.
r.ImageStyle.IncludeImageInHtml = true;

// The list receives every image found in the DOCX during conversion.
List<SautinSoft.RtfToHtml.SautinImage> listImages = new List<SautinSoft.RtfToHtml.SautinImage>();
r.OpenDocx(docxBytes);
string htmlString = r.ToHtml(listImages);

// Let's loop over the list and save all images to disk.
int count = 1;
foreach (SautinSoft.RtfToHtml.SautinImage img in listImages)
{
    img.Img.Save(String.Format(@"d:\image{0}.png", count));
    count++;
}

Technical information and requirements


Requires only .NET Framework 4.0 or later, or .NET Core 2.0 or later. Our product is compatible with all .NET languages and supports all operating systems where .NET Framework and .NET Core can be used.

Note that «RTF to HTML .Net» is written entirely in managed C#, which makes it a fully standalone, independent library.

.NET Framework, .NET Core
  • .NET Framework 4.0, 4.5, 4.6.1 and higher.
  • .NET Standard 2.0
  • .NET Core and .NET 5.0

Multi-platform component, runs on:

  • Windows
  • Linux
  • Mac OS

Our component has proven itself on cloud platforms and services:

  • SharePoint
  • Google Cloud Platform
  • Amazon Web Services (AWS)
  • Microsoft Azure
  • Docker etc.