Let's assume your site is now working with option A. Conversion rate is 5.7%, measured over a month.
You're now running an A/B test for one week with option A and another version, option B. You get a conversion rate of 6.5% for option A and 5.9% for option B. Normally, you'd say that 6.5% is better than 5.9%. But how sure can you be if you don't control the other factors and both are performing better than before? How many visitors do you need to offset the influence of offsite factors?
Set your site up to be able to handle option A and option B at the same time, randomly assigning visitors to each. Now you don't need to control for anything, because you can just compare the A-visitors to the B-visitors.
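Here's a minimal sketch of what that assignment could look like; the experiment name and the 50/50 split are just illustrative. The idea is to hash a stable visitor ID so each visitor lands in the same variant on every visit, and the split stays random across visitors.

```python
import hashlib

def assign_variant(visitor_id: str, experiment: str = "landing-page-test") -> str:
    """Deterministically bucket a visitor into 'A' or 'B' by hashing their ID,
    so the same visitor always sees the same variant."""
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100          # map the hash to 0..99
    return "A" if bucket < 50 else "B"      # 50/50 split

print(assign_variant("visitor-12345"))      # same visitor ID -> same variant every time
```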
In your case, option A started doing a lot better when you started running the test, which is moderately surprising, so maybe something went wrong in user assignment or tracking. But if your testing framework is sound, then it sounds like an external change. You still want to compare A's rate and B's rate over the period when you were randomizing, so 6.5% vs 5.9%, with the earlier 5.7% being more or less irrelevant.
You do need to do significance testing to see how likely you would be to get this wide a difference by chance. The easiest way to do that is to enter your sample and conversion counts into a significance calculator [1]; the p-value tells you how likely a difference this large would be under chance alone (assuming you didn't have any prior reason to expect one side of the experiment to be any better).
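If you'd rather compute it yourself than use a calculator, a pooled two-proportion z-test gives the same kind of answer. The visitor counts below are made up just to show the shape of the calculation, using your 6.5% and 5.9% rates.

```python
from math import sqrt, erf

def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates,
    using a pooled two-proportion z-test."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # convert |z| to a two-sided p-value via the standard normal CDF
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# Hypothetical week: 130/2000 conversions for A (6.5%) vs 118/2000 for B (5.9%)
print(two_proportion_p_value(130, 2000, 118, 2000))  # ~0.43, nowhere near significant
```

With a couple thousand visitors per arm, a 0.6 percentage point gap like this is well within the noise; you'd need substantially more traffic, or a bigger true difference, before the test can call a winner.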